(57) Abstract: The present invention concerns a fusion protein comprising a
recombinase protein, preferably the site-specific DNA recombinase C31-Int of
phage ΦC31, and a peptide sequence which directs the nuclear uptake of the
fusion protein in eucaryotic cells, and the use of this fusion protein to
recombine, invert or delete DNA molecules containing recognition sequences for
said recombinase in eucaryotic cells at high efficiency. In addition the
invention relates to a cell, preferably a mammalian cell which contains
recognition sequences for said recombinase in its genome and wherein the genome
is recombined by the action of said fusion protein. Moreover, the invention
relates to the use of said cell to study the function of genes and for the
creation of transgenic organisms to study gene function at various
developmental stages, including the adult. In conclusion, the present invention
provides a process which enables the highly efficient modification of the
genome of mammalian cells by site-specific recombination.

Modified Recombinase

The present invention concerns a fusion protein comprising a recombinase 
protein, preferably the site-specific DNA recombinase C31-Int of phage ΦC31,
and a peptide sequence which directs the nuclear uptake of the fusion protein in 
eucaryotic cells, and the use of this fusion protein to recombine, invert or delete 
DNA molecules containing recognition sequences for said recombinase in 
eucaryotic cells at high efficiency. In addition the invention relates to a cell, 
preferably a mammalian cell which contains recognition sequences for said 
recombinase in its genome and wherein the genome is recombined by the action 
of said fusion protein. Moreover, the invention relates to the use of said cell to 
study the function of genes and for the creation of transgenic organisms to study 
gene function at various developmental stages, including the adult. In conclusion, 
the present invention provides a process which enables the highly efficient 
modification of the genome of mammalian cells by site-specific recombination. 



Background of the invention 

The controlled and permanent modification of the genome of eucaryotic cells and 
organisms is an important method for research applications, e.g. for studying 
gene function, for medical applications like gene therapy and the creation of 
disease models and for the design of economically important animals and crops. 
The basic methods for genome manipulations by the engineering of endogenous 
genes through gene targeting in murine embryonic stem (ES) cells are well
established and have been used for many years (Capecchi, Trends in Genetics, 5, 70-76
(1989)). Since ES cells can pass mutations induced in vitro to transgenic 
offspring in vivo it is possible to analyse the consequences of gene disruption in 
the context of the entire organism. Thus, numerous mouse strains with 
functionally inactivated genes ("knock-out mice") have been created by this 
technology and utilised to study the biological function of a variety of genes 
(Koller et al., Ann. Rev. Immunol., 10, 705 - 730 (1992)). More importantly,
mouse mutants created by this procedure (also known as "conventional,
complete or classical mutants") contain the inactivated gene in all cells and
tissues throughout life. Thus, classical mouse mutants represent the best animal
model for inherited human diseases as the mutation is introduced into the
germline but are not the optimal model to study gene function in adults, e.g. to
validate potential drug target genes.

A refined method of targeted mutagenesis, referred to as conditional
mutagenesis, employs the Cre/loxP site-specific recombination system which
enables the temporally and/or spatially restricted inactivation of target genes in
cells or mice (Rajewsky et al., J. Clin. Invest., 98, 600 - 603 (1996)). The phage
P1 derived Cre recombinase recognises a 34 bp sequence referred to as loxP site
which is structured as an inverted repeat of 13 bp separated by an asymmetric 8
bp sequence which defines the direction of the loxP site. If two loxP sites are
located on a DNA molecule in the same orientation the intervening DNA sequence
is excised by Cre recombinase from the parental molecule as a closed circle
leaving one loxP site on each of the reaction products (Kilby et al., TIG, 9, 413-421
(1993)). The creation of conditional mouse mutants initially requires the
generation of two mouse strains, one containing two or more Cre recombinase
recognition (loxP) sites in its genome while the other harbours a Cre transgene.
The former strain is generated by homologous recombination in ES cells as
described above, except that the exon(s) of the target gene is (are) flanked by
two loxP sites which reside in introns and do not interfere with gene expression.
The Cre transgenic strain contains a transgene whose expression is either
constitutively active in certain cells and tissues or is inducible by external agents,
depending on the promoter region used. Crossing of the loxP-flanked mouse
strain with the Cre recombinase expressing strain enables the deletion of the
loxP-flanked exons in the genome of doubly transgenic offspring in a prespecified
temporally and/or spatially restricted manner. Thus, the method allows the
analysis of gene function in particular cell types and tissues of otherwise widely
expressed genes. Moreover, it enables the analysis of gene function in the adult
organism by circumventing embryonic lethality which is often the consequence of
complete (germline) gene inactivation. For pharmaceutical research, aiming to
validate the utility of genes and their products for drug development, gene
inactivation which is inducible in adults provides an excellent genetic tool as this
mimics the biological effects of target inhibition upon drug application.
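
The loxP architecture and the excision reaction described above can be illustrated with a short sketch. It is only a string-level model, not a description of the enzymology. The loxP sequence used below is the commonly cited canonical site and is an assumption, as it is not given in this text; the function names are hypothetical.

```python
# Commonly cited canonical loxP site (assumption: not stated in the text above).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_loxp_architecture(site: str) -> bool:
    """Check for 13 bp inverted repeats flanking an asymmetric 8 bp spacer."""
    if len(site) != 34:
        return False
    left, spacer, right = site[:13], site[13:21], site[21:]
    inverted_repeat = right == reverse_complement(left)
    # An asymmetric spacer differs from its own reverse complement,
    # which is what gives the site a direction.
    asymmetric_spacer = spacer != reverse_complement(spacer)
    return inverted_repeat and asymmetric_spacer


def excise(dna: str, site: str = LOXP):
    """Model deletion between two sites lying in the same orientation.

    Returns (parental_product, excised_circle), each retaining one site,
    or None unless exactly two same-orientation sites are found.
    The circle is written as a linear string starting at its site.
    """
    first = dna.find(site)
    if first == -1:
        return None
    second = dna.find(site, first + len(site))
    if second == -1 or dna.find(site, second + len(site)) != -1:
        return None
    parental = dna[:first] + dna[second:]
    circle = dna[first:second]
    return parental, circle


if __name__ == "__main__":
    substrate = "GGG" + LOXP + "AAAA" + LOXP + "CCC"
    print(has_loxp_architecture(LOXP))
    print(excise(substrate))
```

Because the spacer is asymmetric, a site in the opposite orientation reads as a different string on the same strand, so the sketch leaves such a substrate unexcised (inversion is not modelled here).
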

Since the first description of the concept of conditional gene targeting using the
Cre/loxP system in mice in 1994 (Gu et al., Science 265, 103-106 (1994)) this
method has become increasingly popular among the research community and
has resulted in a broad collection of genetic tools for biological research in the mouse.
More than 30 Cre transgenic mouse strains with various tissue specificities for
gene inactivation have been created, including several "deleter" strains which
allow the loxP-flanked target gene segment to be removed in the male or female
germline (Cohen-Tannoudji et al., Mol. Hum. Reprod. 4, 929-938 (1998);
Metzger et al., Curr. Op. Biotech., 10, 470-476 (1999)). The need to characterise
the expression pattern of Cre mediated recombination in newly generated strains
stimulated the construction of a number of "Cre-reporter" strains which harbour
a silent reporter gene the expression of which is activated upon Cre-mediated
deletion (Nagy, Genesis, 26, 99-109 (2000)). Conditional mouse mutants have
been reported for about 20 different genes, many of which could not be studied in
adults as their complete inactivation leads to embryonic lethality
(Cohen-Tannoudji et al., Mol. Hum. Reprod. 4, 929-938 (1998)).

Great efforts have also been made to control the expression of Cre recombinase
in an inducible fashion in mice. After the first demonstration that inducible gene
knock-out is feasible in adult mice using an interferon controlled promoter (Kuhn
et al., Science, 269, 1427-1429 (1995)), mainly two methods were applied to
control the activity of Cre recombinase. First, it has been demonstrated that the
fusion of Cre with the ligand binding domain of a mutant estrogen receptor allows
recombinase activity to be controlled by a specific steroid-like inducer. Several
transgenic mouse strains expressing such a fusion protein have been generated
and allow gene inactivation to be induced in specific tissues (Metzger et al., Curr. Op.
Biotech., 10, 470-476 (1999)). Furthermore, the tetracycline-regulated gene
expression system has been successfully used to control the expression of Cre in
transgenic mice and thus provides a second system for inducible gene
inactivation using doxycycline as inducer (Saam et al., J. Biol. Chem. 274,
38071-38082 (1999)).

In addition to the application of Cre/loxP for gene inactivation by deletion of a
gene segment this recombination system has also proved useful for a
number of other genomic manipulations in ES cells or mice. These include the
conditional activation of transgenes in mice, chromosome engineering to obtain
deletion, translocation or inversion, the simple removal of selection marker
genes, gene replacement, the targeted insertion of transgenes and the
(in)activation of genes by inversion (Nagy, Genesis, 26, 99-109 (2000);
Cohen-Tannoudji et al., Mol. Hum. Reprod. 4, 929-938 (1998)). In conclusion, the
Cre/loxP recombination system has been proven to be extremely useful for the
analysis of gene function in mice by broadening the methodological spectrum for
genome engineering. It can be expected that many of the protocols now
established for the mouse may in future also be applied to other animals or
plants.

In contrast to the huge diversity of genome manipulations which have been
developed for the Cre/loxP system, very limited efforts have been made to
develop further site-specific recombination systems for use in mammalian
cells. Alternative recombination systems of different specificity but with an
efficiency comparable to Cre/loxP could further enhance the flexibility of genome
engineering by the side-by-side use of two or more systems in the same cell or
organism. Furthermore, unidirectional recombination systems which follow a
different mechanism than the reversible Cre/loxP-mediated recombination should
allow the development of new applications for genome engineering which cannot be
performed with the current systems.

The reasons for the almost exclusive use of the Cre/loxP system for site-specific 
recombination in mammalian cells are readily explained by a number of 
requirements which must be fulfilled for the efficient use of a recombinase in 
mammalian cells: 

i) the recombinase should act independently of cofactors like helper
proteins,

ii) it should act independently of the supercoiling status of the target DNA
and also on mammalian chromatin,

iii) it should be efficiently active and stable at a temperature of 37°C, and 

iv) it should recognize a target sequence which is sufficiently long to be 
unique among large genomes, and it should exhibit a very high affinity 
to its target site for efficient action (Kilby et al., TIG, 9, 413-421 
(1993)). 
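
Requirement iv) can be made concrete with a rough estimate of how often one specific site of a given length would occur by chance in a large genome. The sketch below assumes independent, equally frequent bases and a genome of about 3 x 10^9 bp; both are simplifying assumptions for illustration and are not taken from this text, and the function name is hypothetical.

```python
def expected_random_matches(site_length: int, genome_size: int,
                            both_strands: bool = True) -> float:
    """Expected number of chance occurrences of one specific site.

    Simplifying assumption: bases are independent and equally frequent,
    so any given site of length n occurs with probability 4**-n at each
    position.
    """
    positions = max(genome_size - site_length + 1, 0)
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** site_length


# Assumed order of magnitude for a mammalian genome (illustrative only).
GENOME_BP = 3_000_000_000

if __name__ == "__main__":
    for length in (8, 12, 16, 34):
        print(length, expected_random_matches(length, GENOME_BP))
```

Under these assumptions a 12 bp site would be expected hundreds of times by chance, whereas a 34 bp site such as loxP has an expected chance count far below one.
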

Among the more than 200 described members of the integrase and
resolvase/invertase recombinase families only the Cre/loxP system is presently
known to fulfill all of these requirements (Nunes-Duby et al., Nucleic Acids Res.,
26, 391-406 (1998); Kilby et al., TIG, 9, 413-421 (1993); Ringrose et al., J. Mol.
Biol., 284, 363 - 384 (1998)). Besides Cre/loxP a few recombinases have been
shown to exhibit some activity in mammalian cells but their practical value is
presently unclear as their efficiency has not been compared to the Cre/loxP
system on the same genomic recombination substrate and in some cases it is
known that one or more of the criteria listed above are not met. The best
characterised examples are the yeast derived FLP and Kw recombinases which
exhibit a temperature optimum at 30°C but which are unstable at 37°C (Buchholz
et al., Nature Biotech., 16, 657 - 662 (1998); Ringrose et al., Eur. J. Biochem.,
248, 903 - 912). For FLP it has been shown in addition that its affinity to the FRT
target site is much lower as compared to the affinity of Cre to loxP sites
(Ringrose et al., J. Mol. Biol., 284, 363 - 384 (1998)). Other recombinases which
show in principle some activity in mammalian cells are a mutant integrase of
phage λ, the integrases of phages ΦC31 and HK022, mutant γδ-resolvase and
β-recombinase (Lorbach et al., J. Mol. Biol., 296, 1175 - 81 (2000); Groth et al.,
Proc. Natl. Acad. Sci. USA, 97, 5995 - 6000 (2000); Kolot et al., Mol. Biol. Rep.
26, 207 - 213 (1999); Schwikardi et al., FEBS Lett., 471, 147 - 150 (2000); Diaz
et al., J. Biol. Chem., 274, 6634 - 6640 (1999)). Other phage integrase systems
include coliphage P4 recombinase, Listeria phage recombinase, bacteriophage R4
Sre recombinase, CisA recombinase, XisF recombinase and transposon Tn4451
TnpX recombinase (Stark et al. Trends in Genetics 8, 432-439 (1992); Hatfull &
Grindley, in Genetic Recombination. Eds. Kucherlapati & Smith, Am. Soc.
Microbiol., Washington DC, 357-396 (1988)).

However, the practical value of these recombinases and integrases for use in 
mammalian cells is limited as their efficiency to recombine mammalian genomic 
DNA has not been tested or compared with the Cre/loxP system. From the data
available it can be assumed that these recombinases are much less effective than 
the Cre/loxP system. 

In a few cases attempts have been made to improve the performance of
recombinases in mammalian cells: for FLP a mutant showing improved
thermostability and activity at 37°C has been isolated but this mutant is still
considerably more heat labile as compared to Cre (Buchholz et al., Nature
Biotech., 16, 657 - 662 (1998)). In the case of λ-integrase and γδ-resolvase the
absolute requirement for coproteins and supercoiled DNA could be eliminated by
the introduction of specific point mutations (Schwikardi et al. FEBS Lett 471,
147-150 (2000)).

The import of cytoplasmic proteins into the nucleus of eucaryotic cells through
nuclear pores is a regulated, energy dependent process mediated by specific
receptors (Görlich et al., Science, 271, 1513 - 1518 (1996)). Proteins which do
not possess a signal sequence recognised by the nuclear import machinery are
excluded from the nucleus and remain in the cytoplasm. Numerous such
nuclear localisation signal sequences (NLS), which share a high proportion of
basic amino acids, have been characterised (Boulikas, Crit. Rev.
Eucar. Gene Expression, 3, 193 - 227 (1993)), the prototype of which is the 7
amino acid NLS derived from the T-antigen of the SV40 virus (Kalderon et al.,
Cell, 39, 499 - 509 (1984)).

It was believed that the fusion of such an NLS peptide to a recombinase would
possibly enhance the efficiency of the recombinase by mediating its import into the
nucleus and thereby increasing the concentration of the recombinase inside the
nucleus. However, for Cre recombinase it has been shown that the addition of
the SV40 T-antigen NLS does not improve its recombination efficiency in
mammalian cells (Le et al., Nucleic Acids Res., 27, 4703 - 4709 (1999)).
Nevertheless, both Cre and a Cre-NLS-fusion protein are widely used. Schwikardi
et al. (FEBS Lett. 471, 147-150 (2000)) reported a γδ-resolvase-SV40
T-antigen NLS fusion protein, which also did not enhance the recombination
efficiency.

The level of activity exhibited by recombinases of diverse prokaryotic origin in
mammalian cells may be the result of the intrinsic properties of an enzyme
depending on parameters like its temperature optimum, its target site affinity,
protein structure and stability, the degree of cooperativity, the stability of the
synaptic complex and the dependence on coproteins or supercoiled DNA. Within
the specific environment of mammalian cells the activity of a prokaryotic
recombinase could be limited by additional factors such as a short half-life of the
recombinase transcript, a short half-life of its protein, its inability to act on
histone-complexed and higher order structured mammalian genomic DNA,
exclusion from the nucleus or the recognition of cryptic splice sites within its
mRNA resulting in a nonfunctional transcript. Due to the lack of information on
the parameters listed above for almost all recombinases it is presently not
possible to rationally optimise their performance in mammalian cells.

Summary of the Invention 

The object to be solved by the invention of the present application is the
provision of a recombination system alternative to the Cre/loxP system, which
has a different specificity but an efficiency comparable to Cre/loxP. Such an
alternative recombination system is particularly desirable for all those
applications which require more than one potent recombination system to be
carried out successfully (e.g. the methods disclosed in PCT/EP01/00060 and
PCT/EP00/10162). Most surprisingly, it was found that the above object can be
solved by fusing a signal peptide capable of directing the nuclear import
(hereinafter referred to for short as nuclear localisation signal sequence (NLS)) to
specific recombinases.

In contrast to the wildtype recombinases, the resulting modified recombinases 
allow a highly efficient recombination of extrachromosomal and chromosomal 
DNA in mammalian cells, and a highly efficient excision of extrachromosomal and 
chromosomal DNA-stretches, which are flanked by suitable recognition sites for 
said modified recombinases. 

The present invention thus provides: 

(1) A fusion protein (hereinafter also referred to as "modified recombinase") 
comprising 

(a) a recombinase domain comprising a recombinase protein or fragment thereof 
and 

(b) a signal peptide domain being linked to (a) and directing the nuclear import
of said fusion protein in eucaryotic cells,

preferably the activity of the fusion protein in eucaryotic cells is significantly
higher as compared to the activity of the wildtype recombinase corresponding
to the recombinase of the recombinase domain;

(2) in a preferred embodiment of the fusion protein defined in (1) above, the
recombinase domain comprises an integrase protein, preferably a phage ΦC31
integrase (C31-Int) protein or a mutant thereof;

(3) a DNA coding for the fusion protein as defined in (1) or (2) above;

(4) a vector containing the DNA as defined in (3) above;

(5) a microorganism containing the DNA of (3) above and/or the vector of (4)
above;

(6) a process for preparing the fusion protein as defined in (1) or (2) above
which comprises culturing a microorganism as defined in (5) above;

(7) the use of the fusion protein as defined in (1) or (2) above to recombine DNA
molecules, which contain recombinase recognition sequences for the
recombinase protein of the recombinase domain, in eucaryotic cells;

(8) a cell, preferably a mammalian cell containing the DNA sequence of (3) above
in its genome;

(9) the use of the cell of (8) above for studying the function of genes and for the
creation of transgenic organisms;

(10) a transgenic organism, preferably a transgenic mammal containing the DNA
sequence of (3) above in its genome;

(11) the use of the transgenic organism of (10) above for studying gene function
at various developmental stages; and

(12) a method for recombining DNA molecules of cells or organisms containing
recognition sequences for the recombinase protein of the recombinase domain as
defined in (1) or (2) above, which method comprises supplying the cells or
organisms with a fusion protein as defined in (1) or (2) above, or with a DNA
sequence of (3) above and/or a vector of (4) above which are capable of
expressing said fusion protein in the cell or organism.

The present invention combines the use of prokaryotic recombinases such as the
C31-Int with a eukaryotic signal sequence which increases its efficiency in
mammalian cells such that it is equal to the widely used Cre/loxP recombination
system. The improved recombination system of the present invention provides
an alternative recombination system for use in mammalian cells and organisms
which allows the same types of genomic modifications to be performed as shown for
Cre/loxP, including conditional gene inactivation by recombinase-mediated
deletion, the conditional activation of transgenes in mice, chromosome
engineering to obtain deletion, translocation or inversion, the simple removal of
selection marker genes, gene replacement, the targeted insertion of transgenes
and the (in)activation of genes by inversion.

Short Description of Figures 

Fig. 1: C31-Int and Cre recombinase expression vectors and a recombinase
reporter vector used for transient and stable transfections.

Fig. 2: Results of transient transfections of C31-Int and Cre expression vectors
and reporter vectors into CHO cells.

Fig. 3: Results of transient transfections of XisA and SSV recombinase
expression vectors with and without nuclear localisation signals and reporter
vectors into CHO cells.

Fig. 4: Results of transient transfections of C31-Int and Cre recombinase vectors
into a stable reporter cell line.

Fig. 5: In situ detection of β-galactosidase in 3T3(pRK64)-3 cells transfected with
recombinase expression vectors.

Fig. 6: Test vector for C31-Int mediated deletion, pRK64, and the expected
deletion product.

Fig. 7: PCR products generated with the primers P64-1 and P64-4 and sequence
comparison.

Fig. 8: ROSA26 locus of the C31 reporter mice carrying a C31 reporter construct.

Fig. 9: In situ detection of β-galactosidase in a cryosection of the testis of: (A) a
double transgenic mouse carrying both the recombinase and the reporter; and
(B) a transgenic mouse carrying only the reporter as a control.



Detailed Description of the Invention 

The "organisms" according to the present invention are multi-cell organisms and 
can be vertebrates such as mammals (humans and non-human animals including 
rodents such as mice or rats) or non-mammals (e.g. fish), or can be 
invertebrates such as insects or worms, or can be plants (higher plants, algae or
fungi). Most preferred living organisms are mice and fish. 

"Cells" and "eucaryotic cells" according to the present invention include cells 
isolated from the above defined living organism and cultured in vitro. These cells 
can be transformed (immortalized) or untransformed (directly derived from the 
living organism; primary cell culture). 

"Microorganism" according to the present invention relates to procaryotes (e.g. 
E. coli) and eucaryotic microorganisms (e.g. yeasts).

According to embodiment (1) of the present invention, the activity of the fusion
protein in eucaryotic cells is significantly higher as compared to the activity of
the wildtype recombinase corresponding to the recombinase of the recombinase
domain. A "significantly higher activity" in accordance with the present invention
refers to an increase in activity of at least 50%, preferably at least 75%, more
preferably at least 100% relative to the corresponding wildtype recombinase in
eucaryotic cells. A "significantly higher activity" also implies that the resulting
fusion protein has at least 25%, preferably at least 50% and more preferably at
least 75%, of the activity of Cre/loxP in 3T3 cells with a stably integrated target
sequence.
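
The two thresholds in this definition amount to simple arithmetic, sketched below. The function names are hypothetical, the activity values are assumed to come from one and the same assay in arbitrary units, and reading the two conditions as jointly required is an interpretation of the paragraph above rather than something it states explicitly.

```python
def percent_increase(fusion: float, wildtype: float) -> float:
    """Increase of the fusion protein's activity over wildtype, in percent."""
    if wildtype <= 0:
        raise ValueError("wildtype activity must be positive")
    return (fusion - wildtype) / wildtype * 100.0


def percent_of_cre(fusion: float, cre: float) -> float:
    """Fusion protein activity expressed as a percentage of Cre/loxP activity."""
    if cre <= 0:
        raise ValueError("Cre activity must be positive")
    return fusion / cre * 100.0


def is_significantly_higher(fusion: float, wildtype: float, cre: float,
                            min_increase: float = 50.0,
                            min_of_cre: float = 25.0) -> bool:
    """Apply the minimum thresholds (50 % increase, 25 % of Cre).

    Pass 75/50 or 100/75 to test the preferred and more preferred levels.
    """
    return (percent_increase(fusion, wildtype) >= min_increase
            and percent_of_cre(fusion, cre) >= min_of_cre)


if __name__ == "__main__":
    # Invented numbers, purely for illustration.
    print(is_significantly_higher(fusion=80.0, wildtype=40.0, cre=100.0))
```
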

Recombinase proteins which can be used in the recombinase domain of the
fusion protein of the present invention (i.e., giving a fusion having a "significantly
higher activity" as defined above) include, but are not limited to, a certain type
of recombinases belonging to the family of large serine recombinases (Thorpe
et al., Control of directionality in the site-specific recombination system of the
Streptomyces phage ΦC31, Molecular Microbiology 38(2), 232-241 (2000)). This
family includes bacteriophage ΦC31 integrase ("C31-Int"; the amino acid
sequence of said integrase and a DNA sequence coding therefor are shown in
SEQ ID NOs:21 and 20, respectively), coliphage P4 recombinase, Listeria phage
recombinase, bacteriophage R4 Sre recombinase ("R4 Sre" deposited under GI
793758; the amino acid sequence of said recombinase and a DNA sequence
coding therefor are shown in SEQ ID NOs:55 and 54, respectively), Bacillus
subtilis CisA recombinase ("CisA" deposited under GI 142689; the amino acid
sequence of said recombinase and a DNA sequence coding therefor are shown in
SEQ ID NOs:57 and 56, respectively), XisF recombinase from Anabaena sp.
strain PCC 7120 (Cyanobacterium; "XisF" deposited under GI 349678; the amino
acid sequence of said integrase and a DNA sequence coding therefor are shown
in SEQ ID NOs:59 and 58, respectively), transposon Tn4451 TnpX recombinase
("TnpX" deposited under GI 551135; the amino acid sequence of said
recombinase and a DNA sequence coding therefor are shown in SEQ ID NOs:61
and 60, respectively), "XisA" recombinase from Anabaena sp. strain PCC 7120
(Cyanobacterium; the amino acid sequence of said recombinase and a DNA
sequence coding therefor are shown in SEQ ID NOs:63 and 62, respectively),
"SSV" recombinase from phage of Sulfolobus shibatae (the amino acid sequence
of said recombinase and a DNA sequence coding therefor are shown in SEQ ID
NOs:65 and 64, respectively), lactococcal bacteriophage TP901-1 recombinase
(TP901-1 complete genome deposited under GI 13786531; the amino acid
sequence of said recombinase and a DNA sequence coding therefor are shown in
SEQ ID NOs:108 and 107, respectively), and the like, or mutants thereof. Other
procaryotic recombinases known in the art are also applicable.

A "mutant" of the above recombinases in accordance with the present invention
relates to a mutant of the respective original (viz. wild-type) recombinase having
a recombinase activity similar (e.g. at least about 90%) to that of said wild-type
recombinase. Mutants include truncated forms of the recombinase (such as N- or
C-terminal truncated recombinase proteins), deletion-type mutants (where one
or more amino acid residues or segments having more than one continuous
amino acid residue have been deleted from the primary sequence of the wildtype
recombinase), replacement-type mutants (where one or more amino acid
residues or segments of the primary sequence of the wildtype recombinase have
been replaced with alternative amino acid residues or segments), or
combinations thereof.

According to embodiment (2) of the invention, the recombinase domain
comprises an integrase protein, preferably a phage ΦC31 integrase (C31-Int)
protein or a mutant thereof. Thus, the present invention provides a fusion protein 
comprising 

(a) an integrase domain being a C31-Int protein or a mutant thereof, and 

(b) a signal peptide domain being linked to (a) and directing the nuclear import 
of said fusion protein into eucaryotic cells. 

In the fusion protein of embodiment (2), the integrase domain is preferably a 
C31-Int having the amino acid sequence shown in SEQ ID NO:21 or a C-terminal 
truncated form thereof. Suitable truncated forms of the C31-Int comprise amino 
acid residues 306 to 613 of SEQ ID NO:21. 

The signal peptide domain (hereinafter also referred to as "NLS") is preferably
derived from yeast GAL4, SKI3, L29 or histone H2B proteins, polyoma virus large
T protein, VP1 or VP2 capsid protein, SV40 VP1 or VP2 capsid protein, Adenovirus
E1a or DBP protein, influenza virus NS1 protein, hepatitis virus core antigen or
the mammalian lamin, c-myc, max, c-myb, p53, c-erbA, jun, Tax, steroid
receptor or Mx proteins (see Boulikas, Crit. Rev. Eucar. Gene Expression, 3, 193
- 227 (1993)), simian virus 40 ("SV40") T-antigen (Kalderon et al., Cell, 39, 499
- 509 (1984)) or other proteins with known nuclear localisation. The NLS is
preferably derived from the SV40 T-antigen.

Furthermore, the signal peptide domain preferably has a length of 5 to 74,
more preferably 7 to 15 amino acid residues. More preferred is that the signal peptide
domain comprises a segment of 6 amino acid residues wherein at least 2 amino
acid residues, preferably at least 3 amino acid residues are positively charged
basic amino acids. Basic amino acids include, but are not limited to, lysine, arginine
and histidine. Particularly preferred signal peptides are shown in the following
table.

Source protein                      Sequence                            SEQ ID NO:

yeast GAL4                          MKx11CRLKKLKCSKEKPKCAKCLKx5Rx3KTKR  24
yeast SKI3                          IKYFKKFPKD                          25
yeast L29                           MTGSKTRKHRGSGA                      26
                                    MTGSKHRKHPGSGA                      27
yeast histone H2B                   GKKRSKA                             28
polyoma virus large T protein       PKKAREDVSRKRPR                      29
polyoma virus VP1 capsid protein    APKRKSGVSKC                         30
polyoma virus VP2 capsid protein    EEDGPQKKKRRL                        31
SV40 VP1 capsid protein             APTKRKGS                            32
SV40 VP2 capsid protein             PNKKKRK                             33
Adenovirus E1a protein              KRPRP                               34
                                    CGGLSSKRPRP                         35
Adenovirus DBP protein              PPKKRMRRRIEPKKKKKRP                 36
influenza virus NS1 protein         PFLDRLRRDQK                         37
                                    PKQKRKMAR                           38
human lamin A                       SVTKKRKLE                           39
human c-myc                         CGGAAKRVKLD                         40
                                    PAAKRVKLD                           41
                                    RQRRNELKRSP                         42
human max                           PQSRKKLR                            43
human c-myb                         PLLKKIKQ                            44
human p53                           PQPKKKP                             45
human c-erbA                        SKRVAKRKL                           46
viral jun                           ASKSRKRKL                           47
human Tax                           GGLCSARLHRHALLAT                    48
mammalian glucocorticoid receptor   RKTKKKIK                            49
human androgen receptor             RKLKKLGN                            50
mammalian estrogen receptor         RKDRRGGR                            51
Mx proteins                         DTREKKKFLKRRLLRLDE                  52
SV40 T-antigen                      PKKKRKV                             53

The most preferred signal peptide domain is that of SV40 T-antigen having the 
sequence Pro-Lys-Lys-Lys-Arg-Lys-Val. 
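
The length and basic-residue preferences stated for the signal peptide domain can be checked mechanically. The following is a minimal sketch with hypothetical function names; it treats only Lys, Arg and His as basic, although the text says the basic amino acids are not limited to these, and it screens a sequence against the stated preferences without predicting actual nuclear import.

```python
BASIC_RESIDUES = frozenset("KRH")  # Lys, Arg, His (one-letter codes)


def max_basic_in_window(peptide: str, window: int = 6) -> int:
    """Highest count of basic residues in any contiguous segment of `window` residues."""
    if len(peptide) < window:
        return 0  # no segment of the requested length exists
    return max(
        sum(residue in BASIC_RESIDUES for residue in peptide[i:i + window])
        for i in range(len(peptide) - window + 1)
    )


def meets_nls_preferences(peptide: str, min_len: int = 5, max_len: int = 74,
                          window: int = 6, min_basic: int = 2) -> bool:
    """Check length (5 to 74) and the 6-residue segment with >= min_basic basic residues."""
    peptide = peptide.upper()
    if not min_len <= len(peptide) <= max_len:
        return False
    return max_basic_in_window(peptide, window) >= min_basic


if __name__ == "__main__":
    sv40_nls = "PKKKRKV"  # SV40 T-antigen NLS as given in the table (SEQ ID NO:53)
    print(meets_nls_preferences(sv40_nls, min_basic=3))
```

Note that a 5-residue peptide such as KRPRP (SEQ ID NO:34) satisfies the length range but cannot contain a 6-residue segment, so this sketch reports it as not meeting the more preferred segment criterion.
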



The signal peptide domain may be linked to the N-terminus or C-terminus of the
integrase domain or may be integrated into the integrase domain; preferably the
signal peptide domain is linked to the C-terminus of the integrase domain. With
regard to phage ΦC31 integrase protein of embodiment (2) of the invention it
was found that the fusion of an NLS-peptide to the C-terminus of the integrase
provided a much higher increase of activity as compared to the fusion of the
same NLS-peptide to the N-terminus of the integrase (see Example 1, figures 3
and 4).

According to the present invention, the signal peptide domain may be linked to 
the integrase domain directly or through a linker peptide. Suitable linkers include 
peptides having from 1 to 30, preferably 1 to 15 amino acid residues, said amino 
acid residues being essentially neutral amino acids such as Gly, Ala and Val. 
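
At the sequence level, the domain arrangement described above (NLS at the C-terminus or N-terminus, directly or through a linker) can be sketched as string assembly. The function name is hypothetical, the recombinase string in the usage example is a made-up placeholder and not SEQ ID NO:21, and restricting linkers strictly to Gly, Ala and Val is an assumption, since the text only names these as examples of essentially neutral residues.

```python
NEUTRAL_LINKER_RESIDUES = frozenset("GAV")  # Gly, Ala, Val (assumed strict set)


def build_fusion(recombinase: str, nls: str, linker: str = "",
                 nls_at_c_terminus: bool = True) -> str:
    """Assemble a fusion protein sequence from its domains.

    linker: empty for a direct fusion, otherwise 1 to 30 residues drawn
    from the neutral set above.
    """
    if len(linker) > 30:
        raise ValueError("linker longer than 30 residues")
    if not set(linker) <= NEUTRAL_LINKER_RESIDUES:
        raise ValueError("linker contains residues outside Gly/Ala/Val")
    if nls_at_c_terminus:
        return recombinase + linker + nls
    # An N-terminal fusion would additionally need a start methionine
    # ahead of the NLS; that detail is left out of this sketch.
    return nls + linker + recombinase


if __name__ == "__main__":
    placeholder_recombinase = "MDTYAG"  # hypothetical stand-in, not a real sequence
    print(build_fusion(placeholder_recombinase, "PKKKRKV", linker="GA"))
```
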

The most preferred fusion protein of the present invention comprises the amino 
acid sequence shown in SEQ ID NO:23 (a suitable DNA sequence coding for said 
fusion protein being shown in SEQ ID NO:22). 

Further preferred fusion proteins of the present invention are "NLS-XisA" and 
"NLS-SSV" (having the NLS-peptide fused to the N-terminus of the 
recombinases) as shown in SEQ ID NO:67 and 69, respectively (suitable DNA 
sequences coding for said fusion proteins being shown in SEQ ID NO: 66 and 68, 
respectively). 

In embodiments (7), (8), (10) and (12) of the invention the DNA molecules, the 
cell or transgenic organism may also contain recognition sequences for the
recombinase protein of the recombinase domain. Thus, when utilizing the fusion 
protein of embodiment (2), the C31-Int recognition sequences attP and attB are 
present in DNA molecules, the cell or transgenic organism. 

The term "mammal" as used in embodiment (10) of the invention includes non- 
human mammals (viz. animals as defined above) and humans (if such subject 
matter is patentable with the respective patent authority). 

Since the modified recombinase of the invention, in particular the modified
C31-Int, acts in mammalian cells as efficiently (or at least almost as
efficiently) as the widely used Cre/loxP system, it can be used for a large
variety of genomic modifications (including the methods disclosed in
PCT/EP01/00060 and PCT/EP00/10162, the content of which is herewith
incorporated by reference).
Concerning embodiment (11) it is to be noted that the mammals of embodiment 
(10) can be used to study the function of genes, e.g. in mice, by conditional gene 
targeting. For this purpose suitable recognition sequences - when utilizing the 
fusion protein of embodiment (2), one attP and one attB site (C31-Int recognition 
sequences) in the same orientation - can be introduced into introns of a gene by 
homologous recombination of a gene targeting vector in ES cells such that the 
two sites flank one or more exons of the gene to be studied but do not interfere 
with gene expression. A selection marker gene, needed to isolate recombinant ES 
cell clones, can be flanked by two recognition sites of another recombinase such 
as loxP or FRT sites to enable deletion of the marker gene upon transient 
expression of the respective recombinase in ES cells. These ES cells can be used
to generate germline chimaeric mice which transmit the target gene modified by
att sites to their offspring, allowing a modified mouse strain to be
established. The crossing of this strain with a C31-Int recombinase transgenic
line or the application of C31-Int protein will result in the deletion of the
att-flanked gene segment from the genome of doubly transgenic offspring and thus
in the inactivation of the target gene in a prespecified temporally and/or
spatially restricted manner. The C31-Int transgenic strain contains a transgene
whose expression is either constitutively active in certain cells and tissues or is 
inducible by external agents, depending on the promoter region used. If an attB
and an attP site are placed into the genome in opposite orientation, C31-Int
mediated recombination results in the irreversible inversion of the flanked gene
segment, leading to the functional loss of one or more exons of the target gene.
Thus, the method allows the analysis of gene function in particular cell types and 
tissues of otherwise widely expressed genes and circumvents embryonic lethality 
which is often the consequence of complete (germline) gene inactivation. For the
validation of genes and their products for drug development, gene inactivation
which is inducible in adults provides an excellent genetic tool as it mimics the
biological effects of target inhibition upon drug application. If a pair of
attB/attP sites is placed in the same or opposite orientation into a chromosome
at a large distance using two gene targeting vectors, C31-Int recombination
allows chromosome segments containing one or more genes to be deleted or
inverted, or chromosomal translocations to be generated if the two sites are
located on different chromosomes. In another application of the method a pair of
attB/attP sites is placed in the same orientation within a transgene such that
the deletion of the att-flanked DNA segment results in gene expression, e.g. of
a toxin or reporter gene for cell lineage studies, or in the inactivation of the
transgene.
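
The relationship between site arrangement and recombination outcome described
above can be summarised in a short sketch. The following Python fragment is an
illustration only: the class `AttSite`, the function `predict_outcome` and all
coordinates are hypothetical names and values chosen for this example. It
assumes exactly one attB and one attP site and ignores recombination
efficiency, distance effects and chromatin context.

```python
# Illustrative sketch only; class, function and coordinates are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class AttSite:
    kind: str         # "attB" or "attP"
    chromosome: str   # chromosome (or DNA molecule) carrying the site
    position: int     # coordinate in bp, used here only for documentation
    orientation: int  # +1 or -1 relative to the reference strand


def predict_outcome(a: AttSite, b: AttSite) -> str:
    """Predict the rearrangement produced by one attB x attP recombination."""
    if {a.kind, b.kind} != {"attB", "attP"}:
        return "no recombination"   # an attB/attP pair is required
    if a.chromosome != b.chromosome:
        return "translocation"      # sites on different chromosomes
    if a.orientation == b.orientation:
        return "deletion"           # same orientation: flanked segment excised
    return "inversion"              # opposite orientation: segment inverted


if __name__ == "__main__":
    att_b = AttSite("attB", "chr6", 1_000, +1)
    att_p = AttSite("attP", "chr6", 5_000, +1)
    print(predict_outcome(att_b, att_p))
```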

In addition, in accordance with embodiment (12) of the invention, the
recombination system of embodiment (1), in particular the C31-Int recombination
system of embodiment (2), can also be used for the site-specific integration of
foreign DNA into the genome of mammalian cells, e.g. for gene therapy. For this
purpose, and if the C31-Int recombination system of embodiment (2) is utilized,
only one attB (or attP) site is initially introduced into the genome by
homologous recombination, or an endogenous genomic sequence which resembles attB
or attP is used. The application of a vector containing an attP (or attB) site
to such cells or mice in conjunction with the expression of C31-Int recombinase
will lead to the site-specific integration of the vector into the genomic att
site.

Thus, the present invention provides a process which enables the highly efficient 
modification of the genome of mammalian cells by site-specific recombination. 
Said process possesses the following advantages over current technology: 

(i) the modified recombinase, in particular the modified C31-Integrase, allows
the recombination of extrachromosomal and genomic DNA in mammalian cells at
much higher efficiency than its wildtype form;

(ii) the modified recombinase, in particular the modified C31-Integrase, is the
first described alternative recombination system with equal efficiency to
Cre/loxP for the recombination of chromosomal DNA in mammalian cells.

The appended figures further explain the present invention: 

30 Figure 1 shows C31-Int and Cre recombinase expression vectors and a 
recombinase reporter vector used for transient and stable transfections. 
A-D: Mammalian expression vectors for recombinases which contain the CMV 
immediate early promoter followed by a hybrid intron, the coding region of the 
recombinase to be tested, and an artificial polyadenylation signal sequence (pA).
A: pCMV-C31Int(wt) containing the nonmodified (wildtype) 1.85 kb coding region
of C31-Int as found in the genome of phage ΦC31.

B: pCMV-C31Int(NNLS) containing a modified C31-Int gene coding for the full
length C31-Int protein with an N-terminal fusion to the SV40 virus large T
antigen nuclear localisation signal (NLS).

C: pCMV-C31Int(CNLS) containing a modified C31-Int gene coding for the full
length C31-Int protein with a C-terminal fusion to the SV40 virus large T antigen
nuclear localisation signal (NLS).

D: pCMV-Cre contains the 1.1 kb Cre coding region with an N-terminal fusion to
the SV40 T antigen NLS.

E: Recombination substrate vector pRK64 contains an SV40 promoter region
followed by a 1.1 kb cassette consisting of the coding region of the puromycin
resistance gene and a polyadenylation signal sequence, flanked 5' by the 84 bp
attB and 3' by the 84 bp attP recognition site of C31-Int. pRK64 contains in
addition two Cre recognition (loxP) sites in direct orientation next to the att
sites.

Figure 2 shows results of transient transfections of C31-Int and Cre recombinase
and reporter vectors into CHO cells.

All transfections were performed with a fixed amount of the reporter plasmid
pRK64 and 0.5 ng or 1 ng of the recombinase expression plasmids
pCMV-C31-Int(wt) (samples 4-5), pCMV-C31-Int(NNLS) (samples 6-7),
pCMV-C31-Int(CNLS) (samples 8-9) or pCMV-Cre (samples 10-11). Negative controls:
transfection with pRK64 (sample 3) or pUC19 alone (sample 1). Positive control:
transfection with the Cre-recombined reporter pRK64(ΔCre) (sample 2).
The vertical rows show the mean values and standard deviation of "Relative Light
Units" obtained from lysates with the assay for β-galactosidase (RLU (β-Gal)),
the RLU from the assay for luciferase, the ratio of the β-galactosidase and
luciferase values with standard deviation (RLU × 10⁵ (Gal/Luc)), and the
relative activity of the various recombinases as compared to the positive
control defined as 1.
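
The quantities listed in this legend can be computed as in the following
sketch. It is an illustration only: the function names, the optional `scale`
parameter and the example readings are assumptions made here and are not data
from the experiments. Each well's β-galactosidase reading is divided by its
luciferase reading, the per-well ratios are summarised as mean and standard
deviation, and the relative activity is the sample mean divided by the mean of
the positive control.

```python
# Illustrative sketch only; function names and example values are hypothetical.
from statistics import mean, stdev


def normalised_ratios(gal_rlu, luc_rlu, scale=1.0):
    """Per-well Gal/Luc ratio, multiplied by an optional display scale factor."""
    if len(gal_rlu) != len(luc_rlu):
        raise ValueError("need one luciferase reading per galactosidase reading")
    return [scale * g / l for g, l in zip(gal_rlu, luc_rlu)]


def summarise(values):
    """Mean and sample standard deviation of replicate wells."""
    return mean(values), stdev(values)


def relative_activity(sample_mean, positive_control_mean):
    """Activity relative to the positive control, which is defined as 1."""
    return sample_mean / positive_control_mean


if __name__ == "__main__":
    # Hypothetical readings from four replicate wells of one sample.
    gal = [2.0e5, 2.4e5, 1.8e5, 2.2e5]
    luc = [1.0e6, 1.1e6, 0.9e6, 1.0e6]
    avg, sd = summarise(normalised_ratios(gal, luc))
    print(avg, sd, relative_activity(avg, 0.5))
```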

Figure 3 shows results of transient transfections of XisA and Ssv recombinases 
and reporter vectors into CHO cells. 

All transfections were performed with fixed amounts of the reporter plasmids
pPGKnif (for XisA) and pPGKattA (for SSV) and 25 ng or 100 ng of the
recombinase expression plasmids pCMV-XisA, pCMV-XisA(NNLS) and 10 ng or 20
ng of the expression plasmids pCMV-Ssv and pCMV-Ssv(NNLS). Negative
controls: transfection with pPGKnif or pPGKattA alone.

The vertical rows show the mean values and standard deviation of "Relative Light
Units" obtained from lysates with the assay for β-galactosidase (RLU (β-Gal)),
the RLU from the assay for luciferase, and the ratio of the β-galactosidase and
luciferase values with standard deviation (RLU × 10⁵ (Gal/Luc)).

Figure 4 shows results of transient transfections of recombinase vectors into a
stable reporter cell line.

All transfections were performed with an NIH 3T3 derived clone containing stably
integrated copies of the pRK64 recombination substrate vector. Either 32 ng or
64 ng of the recombinase expression plasmids pCMV-C31-Int(wt) (samples 2-3),
pCMV-C31-Int(NNLS) (samples 4-5), pCMV-C31-Int(CNLS) (samples 6-7) or
pCMV-Cre(NNLS) (samples 8-9) were used. Negative control: transfection with
pUC19 alone (sample 1).

The vertical rows show the mean values and standard deviation of "Relative Light
Units" obtained from lysates with the assay for β-galactosidase (RLU (β-Gal)) and
the relative activity of the various recombinases as compared to the value
obtained with pCMV-Cre(NNLS) defined as 1.

Figure 5 shows the in situ detection of β-galactosidase in 3T3(pRK64)-3 cells
transfected with recombinase expression vectors.

The Cre and C31-Int recombinase reporter cell line 3T3(pRK64)-3 was either not
transfected with DNA (A), transfected with the Cre expression vector pCMV-Cre
(B) or with the C31-Int expression vector pCMV-C31-Int(CNLS) (C). Two days after
transfection the cells were fixed and stained using the histochemical X-Gal
assay, which develops a blue stain in β-galactosidase positive cells, indicating
recombinase mediated activation of the reporter gene.

Figure 6 shows the test vector for C31-Int mediated deletion, pRK64, and the
expected product of deletion, pRK64(ΔInt).

Plasmid pRK64 contains the 1.1 kb cassette of the coding region of the
puromycin resistance gene and a polyadenylation signal, which is flanked 5' by
the 84 bp attB and 3' by the 84 bp attP recognition site (large triangles) of
C31-Int. These attB and attP sites are oriented in the same way to each other
(thick black arrows) as is used by the ΦC31 phage to integrate into the
bacterial genome. In addition, the cassette is flanked by two Cre recombinase
recognition (loxP) sites in the same orientation (black small triangles). For
better orientation the half sites of the att sequences are labelled by a
direction (thin arrow) and numbered 1-4. The 3 bp sequence within the att sites
at which recombination occurs is framed by a box. The positions at which the PCR
primers P64-1 and P64-4 hybridise to the pRK64 vector are indicated by arrows,
pointing into the 3' direction of both oligonucleotides.
pRK64(ΔInt) depicts the deletion product expected from the C31-Int mediated
recombination between the att sites of pRK64. The recombination between a pair
of attB/attP sites generates an attR site remaining on the parental DNA molecule
while the puromycin cassette is excised. In this configuration the primers P64-1
and P64-4 will amplify a PCR product of 630 bp from pRK64(ΔInt).

Figure 7 shows PCR products generated with the primers P64-1 and P64-4 and a
sequence comparison of the PCR product.

A: Analysis of PCR products on an agarose gel from PCR reactions using the
primers P64-1 and P64-4 on DNA extracted from MEF5-5 cells transfected 2 days
before with plasmid pRK64 alone (lane 4), with pRK64 + pCMV-Cre (lane 3), with
pRK64 + pCMV-C31-Int(wt) (lane 2), and from a control reaction which did not
contain cellular DNA (lane 1). The product with an apparent size around 650 bp,
as compared to the size marker used, from lane 2 was excised from the agarose
gel and purified. The PCR product was cloned into a sequencing plasmid vector
and gave rise to the plasmid pRK80d. The insert of this plasmid was sequenced
using reverse primer (seq80d) and compared to the predicted sequence of the
pRK64 vector after C31-Int mediated deletion of the att flanked cassette,
pRK64(ΔInt). The cloned PCR product shows 100% identity with the predicted
attR sequence after deletion. The generated attR site is shown in a box, with the
same sequence designation used in Figure 5. The nucleotide positions (pos.) of
the compared sequences pRK64(ΔInt) and Seq80d are indicated.

Figure 8 shows the modified ROSA26 locus of C31 reporter mice (SEQ ID
NO:106). A recombination substrate has been inserted in the ROSA26 locus. The
substrate consists of a splice acceptor (SA) followed by a cassette consisting of
the hygromycin resistance gene driven by a PGK promoter and flanked by the
recombination sites attB and attP. In addition the reporter contains two Cre
recognition sites (loxP) in direct orientation next to the att sites. This
cassette is followed by the coding region for β-galactosidase, which is only
expressed when the hygromycin resistance gene has been deleted by recombination.

Figure 9 shows the in situ detection of β-galactosidase activity. A cryosection
of the testis of a double transgenic mouse carrying both the C31-Int recombinase
and the recombination substrate was stained with X-Gal (A). The blue colour
indicates recombination of the substrate, which leads to the expression of
β-galactosidase. As a control a cryosection of testis of a transgenic mouse
carrying only the recombination substrate was stained with X-Gal (B).

The present invention is further illustrated by the following Examples which are,
however, not to be construed as limiting the invention.

Examples 
Example 1 

As compared to Cre recombinase, the wildtype form of C31-Int exhibits a
significantly lower recombination activity in mammalian cells, which falls in the
range of 10 - 40% of Cre, depending on the assay system used (see below). As
a measure which may increase C31-Int efficiency in eukaryotic cells, we designed
mammalian expression vectors for N- or C-terminal fusion proteins of C31-Int
with a peptide which is recognised by the nuclear import machinery. The
recombination efficiency obtained with these modified C31-Int recombinases in
mammalian cells was compared side by side to the unmodified (wildtype) form of
C31-Int and to Cre recombinase. For the quantification of recombinase activities
the expression vectors were transiently introduced into a mammalian cell line
together with a reporter vector which contains C31-Int and Cre target sites and
leads to the expression of β-galactosidase upon recombinase mediated deletion of
a vector segment flanked by recombinase recognition sites.

A. Plasmid constructions:
Construction of the recombination test vectors pPGKnif and pPGKattA: first a nifD
site (Haselkorn, Annu. Rev. Genet. 26, 113-130 (1992)), generated by the
annealing of the two synthetic oligonucleotides nifD3 (SEQ ID NO:89) and nifD4
(SEQ ID NO:90), was ligated into the BamHI restriction site of the vector
pSV-Pax1 (Buchholz et al., Nucleic Acids Res., 24, 4256-4262 (1996)), 3' of its
puromycin resistance gene and loxP site, giving rise to plasmid pPGKnifD3' (SEQ
ID NO:79). Next, another nifD site, generated by the annealing of the two
synthetic oligonucleotides nifD1 (SEQ ID NO:87) and nifD2 (SEQ ID NO:88), was
ligated into the BstBI restriction site of plasmid pPGKnifD3', upstream of the
puromycin resistance gene and loxP site, giving rise to plasmid pPGKnifD (SEQ ID
NO:78). For pPGKattA (Muskhelishvili et al., Mol. Gen. Genet. 237, 334-342
(1993)) first a 352 bp fragment was amplified from genomic DNA from the
thermophilic bacterium Sulfolobus shibatae (DSM-5389, DSMZ Braunschweig -
Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Mascheroder
Weg 1b, D-38124 Braunschweig, Germany) with oligonucleotides SSV5 (SEQ ID
NO:96) and SSV6 (SEQ ID NO:97) including restriction sites for BamHI and
BstBI. The amplified fragment was cloned into the BamHI site of the vector
pSV-Pax1, giving rise to plasmid pPGKattA1 (SEQ ID NO:82); subsequently the same
352 bp fragment was cloned into the BstBI site of pPGKattA1, giving rise to the
plasmid pPGKattA2 (SEQ ID NO:83). The sequence and orientation of both nifD
sites and attA sites were confirmed by DNA sequence analysis. In
pPGKnifD/pPGKattA2 the newly cloned nifD/attA sites (positions 535-619 and
1722-1787/positions 6718-7081 and 12-363) are in the same orientation
flanking the puromycin resistance gene and the SV40 early polyadenylation
sequence. The nifD/attA sites are followed by loxP sites in the same orientation
(positions 623-656 and 1794-1827/positions 7085-7118 and 369-402). The
puromycin cassette is transcribed from the SV40 early enhancer/promoter region
and followed by the coding region for E. coli β-galactosidase and the SV40 late
region polyadenylation sequence.

Construction of XisA and SSV expression vectors: First the XisA gene of
cyanobacterium PCC7120 was amplified by PCR from genomic DNA from Nostoc
strain PCC7120 (CNCM-Collection Nationale de Cultures de Microorganismes,
Institut Pasteur, Paris) using the primers XisA1 (SEQ ID NO:84) and XisA3 (SEQ
ID NO:86), and XisA1 (SEQ ID NO:84) and XisA2 (SEQ ID NO:85) (with NLS).
The ends of the PCR product were digested with NotI and the product was ligated
into plasmid pBluescript II KS, opened with NotI, giving rise to plasmids pRK42a
and pRK43 (with NNLS). The DNA sequence of the insert was determined and
found to be identical to the published XisA sequence (Genbank GI:3953452)
apart from four silent point mutations. The XisA gene was isolated as a 1.4 kb
fragment from pRK42a and pRK43 by digestion with NotI and ligated into the
generic mammalian expression vector pRK50 (see below), opened with NotI,
giving rise to the XisA expression vectors pCMV-XisA (SEQ ID NO:76) and
pCMV-XisA(NNLS) (SEQ ID NO:77). pCMV-XisA(wt) contains a cytomegalovirus
immediate early gene promoter (position 1 - 616), a 240 bp hybrid intron
(position 716 - 953), the XisA gene (position 974 - 2392), and a synthetic
polyadenylation sequence (position 2413 - 2591).

The SSV gene was amplified from genomic DNA from the thermophilic bacterium
Sulfolobus shibatae (DSM-5389, DSMZ Braunschweig - Deutsche Sammlung von
Mikroorganismen und Zellkulturen GmbH, Mascheroder Weg 1b, D-38124
Braunschweig, Germany) in two PCR steps because of an internal attP sequence.
First, two overlapping PCR fragments were created with the oligonucleotides
SSV1-1 (SEQ ID NO:91) (or SSV1-2 for the SSV(NNLS) gene) and SSV2 (SEQ ID
NO:93) and oligonucleotides SSV3 (SEQ ID NO:94) and SSV4 (SEQ ID NO:95).
Using these overlapping fragments as template, a 1000 bp fragment containing
the complete SSV coding sequence was amplified with primers SSV1-1 (or SSV1-2
for the SSV(NNLS) gene) and SSV4. The 5' 620 bp fragments of these PCR
products were isolated by digestion with NotI-XhoI and cloned into vector
pBluescript II KS, giving rise to plasmids pRK47 and pRK48 (with NLS). The 3'
380 bp fragment generated by XhoI digestion was cloned into the XhoI
restriction site of vector pBluescript II KS, giving rise to the plasmid pBS-SSVs
(SEQ ID NO:72). The 380 bp SSV fragment was then isolated by digestion of
pBS-SSVs with XhoI and ligated into pRK47 and pRK48 opened by XhoI, giving
rise to plasmids pBS-SSV3 (SEQ ID NO:70) and pBS-SSV4 (SEQ ID NO:71) (with
NLS) containing the complete SSV gene. Sequencing of the plasmids revealed
one point mutation in both plasmids. Therefore 312 bp/91 bp fragments
generated by digestion with EcoRV-SmaI/EcoRV-XhoI of another clone of pRK47
were exchanged in plasmids pBS-SSV3/pBS-SSV4. Sequences were confirmed
by sequencing. The SSV gene was isolated from pRK47 and pRK48 by digestion
with NotI and KpnI and ligated into the generic mammalian expression vector
pRK50 (see below), opened with NotI and SalI, giving rise to the SSV expression
vectors pCMV-SSV(wt) (SEQ ID NO:74) and pCMV-SSV(NNLS) (SEQ ID NO:75).

Construction of the recombination test vector pRK64: first an attB site (Thorpe et
al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)), generated by the
annealing of the two synthetic oligonucleotides C31-4 (SEQ ID NO:1) and C31-5
(SEQ ID NO:2), was ligated into the BstBI restriction site of the vector pSV-Pax1
(Buchholz et al., Nucleic Acids Res., 24, 4256-4262 (1996)), 5' of its puromycin
resistance gene and loxP site, giving rise to plasmid pRK52. The sequence and
orientation of the cloned attB site was confirmed by DNA sequence analysis.
Next, an attP site (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510
(1998)), generated by the annealing of the two synthetic oligonucleotides C31-6
(SEQ ID NO:3) and C31-7-2 (SEQ ID NO:4), was ligated into the BamHI
restriction site of plasmid pRK52, downstream of the puromycin resistance gene
and loxP site, giving rise to plasmid pRK64 (SEQ ID NO:5). The sequence and
orientation of the attP site was confirmed by DNA sequence analysis. In pRK64
the newly cloned attB (position 348 - 431) and attP (position 1534 - 1617) sites
are in the same orientation flanking the puromycin resistance gene and the SV40
early polyadenylation sequence. The attB and attP sites are followed by loxP sites
in the same orientation (positions 435 - 469 and 1624 - 1658). The puromycin
cassette is transcribed from the SV40 early enhancer/promoter region and
followed by the coding region for E. coli β-galactosidase and the SV40 late region
polyadenylation sequence.
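
The coordinates quoted for pRK64 allow a quick consistency check of the site
lengths and of the size of the segment removed by each recombinase. The
following Python sketch is an illustration only; the function names are
hypothetical, and the calculation assumes that the crossover occurs at the
same offset within both annotated sites, so that the excised segment equals the
distance between the start positions of the two sites.

```python
# Illustrative sketch only; coordinates are taken from the text, names are hypothetical.
ATT_B = (348, 431)     # attB site in pRK64
ATT_P = (1534, 1617)   # attP site in pRK64
LOXP_5 = (435, 469)    # upstream loxP annotation
LOXP_3 = (1624, 1658)  # downstream loxP annotation


def inclusive_length(start: int, end: int) -> int:
    """Number of positions from start to end, counting both ends."""
    return end - start + 1


def deleted_length(site_5: tuple, site_3: tuple) -> int:
    """Length removed by recombination between two equally long direct sites.

    Assumes the crossover lies at the same offset in both sites, so one site
    copy plus the intervening DNA is excised.
    """
    if inclusive_length(*site_5) != inclusive_length(*site_3):
        raise ValueError("sites must have the same annotated length")
    return site_3[0] - site_5[0]


if __name__ == "__main__":
    print("att site length:", inclusive_length(*ATT_B))
    print("removed by C31-Int:", deleted_length(ATT_B, ATT_P), "bp")
    print("removed by Cre:", deleted_length(LOXP_5, LOXP_3), "bp")
```

Under these assumptions both annotated att sites span 84 positions, in
agreement with the 84 bp stated in the text, and the two recombinases remove
segments of nearly identical size.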

Construction of C31-Int expression vectors: First the C31-Int gene of phage
ΦC31 was amplified by PCR from phage DNA (DSM-49156, DSMZ-Deutsche
Sammlung von Mikroorganismen und Zellkulturen GmbH, Mascheroder Weg 1b,
D-38124 Braunschweig, Germany) using the primers C31-1 (SEQ ID NO:6) and
C31-3 (SEQ ID NO:7). The ends of the PCR product were digested with NotI and
the product was ligated into plasmid pBluescript II KS, opened with NotI, giving
rise to plasmid pRK40. The DNA sequence of the 1.85 kb insert was determined
and found to be identical to the published C31-Int gene (Kuhstoss et al., J. Mol.
Biol. 222, 897-908 (1991)), except for an error in the stop codon. This error was
repaired by PCR amplification of a 300 bp fragment from plasmid pRK40 using
the primers C31-8 (SEQ ID NO:8) and C31-9 (SEQ ID NO:9), which provide a
corrected stop codon. The ends of this PCR fragment were digested with
Eco47III and XhoI, and the fragment was ligated into plasmid pRK40, which had
been opened with Eco47III and XhoI to remove the fragment containing the
defective stop codon. The resulting plasmid pRK55 contains the correct C31-Int
gene as confirmed by DNA sequence analysis.

The C31-Int gene was isolated from pRK55 as a 1.85 kb fragment by digestion with
NotI and XhoI and ligated into the generic mammalian expression vector pRK50
(see below), opened with NotI and XhoI, giving rise to the C31-Int expression
vector pCMV-C31-Int(wt). pCMV-C31-Int(wt) (SEQ ID NO:10) contains a 700 bp
cytomegalovirus immediate early gene promoter (position 1 - 700), a 270 bp
hybrid intron (position 701 - 970), the C31-Int gene (position 978 - 2819), and
a 189 bp synthetic polyadenylation sequence (position 2831 - 3020).
For the construction of pCMV-C31-Int(NNLS) a 1.5 kb fragment was amplified by
PCR from phage DNA using oligonucleotides C31-2 (SEQ ID NO:98) and C31-3
(SEQ ID NO:7). The ends of the PCR product were digested with NotI and the
product was ligated into plasmid pBluescript II KS, opened with NotI, giving rise
to plasmid pRK41 (SEQ ID NO:99). A 1100 bp fragment generated by digestion
of plasmid pRK41 with NcoI and NotI was then ligated into plasmid pRK55 (SEQ
ID NO:80), opened with NcoI and NotI, giving rise to the plasmid pRK63 (SEQ ID
NO:81). The C31-Int gene with N-terminal NLS was isolated as a 1.8 kb fragment
from pRK63 by digestion with NotI and XhoI and ligated into the mammalian
expression vector pRK50, opened with NotI and XhoI, giving rise to the C31-Int
expression vector pCMV-C31-Int(NNLS). pCMV-C31-Int(NNLS) (SEQ ID NO:73)
contains a 700 bp cytomegalovirus immediate early gene promoter (position 1
- 700), a 270 bp hybrid intron (position 701 - 970), the C31-Int gene with
N-terminal NLS (position 976 - 2838), and a 189 bp synthetic polyadenylation
sequence (position 2851 - 3040).

For the construction of pCMV-C31-Int(CNLS), the 3'-end of the C31-Int gene
was amplified from pCMV-C31-Int(wt) as a 300 bp PCR fragment using the
primers C31-8 (SEQ ID NO:8) and C31-2-2 (SEQ ID NO:11). Primer C31-2-2
modifies the 3'-end of the wildtype C31-Int gene such that the stop codon is
replaced by a sequence of 21 basepairs coding for the SV40 T-antigen nuclear
localisation sequence of 7 amino acids (Proline-Lysine-Lysine-Lysine-Arginine-
Lysine-Valine) (Kalderon et al., Cell, 39, 499 - 509 (1984)), followed by a new
stop codon. The ends of this 300 bp PCR fragment were digested with Eco47III
and XhoI, and the fragment was ligated into plasmid pCMV-C31-Int(wt), which had
been opened with Eco47III and XhoI, to replace the 3'-end of the wildtype
C31-Int gene, resulting in the plasmid pCMV-C31-Int(CNLS). The identity of the
new gene segment was verified by DNA sequence analysis. pCMV-C31-Int(CNLS) (SEQ
ID NO:12) contains a 700 bp cytomegalovirus immediate early gene promoter
(position 12 - 711), a 270 bp hybrid intron (position 712 - 981), the modified
C31-Int gene (position 989 - 2851), and a 189 bp synthetic polyadenylation
sequence (position 2854 - 3043).
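
The feature coordinates quoted for the wildtype and C-terminal NLS vectors can
be cross-checked with a few lines of arithmetic. The following Python sketch is
an illustration only; the dictionary layout and function names are assumptions
made here, and the coordinates are copied from the text and counted
inclusively. It shows that the modified gene is 21 bp longer than the wildtype
gene, which corresponds to the 7 codons of the NLS.

```python
# Illustrative sketch only; coordinates are from the text, names are hypothetical.
NLS_SV40 = "PKKKRKV"  # 7 amino acids, i.e. 21 bp of coding sequence

FEATURES = {
    "pCMV-C31-Int(wt)": {
        "CMV promoter": (1, 700),
        "hybrid intron": (701, 970),
        "C31-Int gene": (978, 2819),
        "polyadenylation sequence": (2831, 3020),
    },
    "pCMV-C31-Int(CNLS)": {
        "CMV promoter": (12, 711),
        "hybrid intron": (712, 981),
        "C31-Int gene": (989, 2851),
        "polyadenylation sequence": (2854, 3043),
    },
}


def span_length(start: int, end: int) -> int:
    """Length of a feature when both end coordinates are included."""
    return end - start + 1


def feature_lengths(plasmid: str) -> dict:
    """Map each annotated feature of a plasmid to its inclusive length in bp."""
    return {name: span_length(*span) for name, span in FEATURES[plasmid].items()}


if __name__ == "__main__":
    wt = feature_lengths("pCMV-C31-Int(wt)")
    cnls = feature_lengths("pCMV-C31-Int(CNLS)")
    print("wildtype gene:", wt["C31-Int gene"], "bp")
    print("CNLS gene:", cnls["C31-Int gene"], "bp")
    print("difference:", cnls["C31-Int gene"] - wt["C31-Int gene"], "bp")
```

Counted inclusively, the quoted polyadenylation coordinates span 190 positions
in both vectors, one more than the stated 189 bp; this may simply reflect a
different coordinate convention for that feature.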

To generate the Cre expression plasmid pCMV-Cre (SEQ ID NO:13), the coding
sequence of Cre recombinase (Sternberg et al., J. Mol. Biol., 187, 197 - 212
(1986)) with an N-terminal fusion of the 7 amino acid SV40 T-antigen NLS (see
above) was recovered from plasmid pgk-Cre and cloned into the NotI and XhoI
sites of plasmid pRK50. pRK50 (SEQ ID NO:14) is a generic expression vector for
mammalian cells based on the cloning vector pNEB193 (New England Biolabs Inc,
Beverly, MA, USA). pRK50 was built by insertion into pNEB193 of a 700 bp
cytomegalovirus immediate early gene (CMV-IE) promoter (position 1-700)
from plasmid pIREShyg (GenBank#U89672; Clontech Laboratories Inc, Palo Alto,
CA, USA), a synthetic 270 bp hybrid intron (position 701-970), consisting of an
adenovirus derived splice donor and an IgG derived splice acceptor sequence
(Choi et al., Mol. Cell. Biol., 11, 3070 - 3074 (1991)), and a 189 bp synthetic
polyadenylation sequence (position 1000-1188) built from the polyadenylation
consensus sequence and 4 MAZ polymerase pause sites (Levitt et al.,
Genes & Dev., 3, 1019 - 1025 (1989); The EMBO J., 13, 5656 - 5667 (1994)). The
positive control plasmid pRK64(ΔCre) (SEQ ID NO:15) was generated from
pRK64 by transformation into the Cre expressing E. coli strain 294-Cre (Buchholz
et al., Nucleic Acids Res., 24, 3118 - 3119 (1996)).

One of the transformed subclones was confirmed for the Cre mediated deletion of
the loxP-flanked cassette by restriction mapping and further expanded. Plasmid
pUC19 is a cloning vector without eukaryotic control elements used to equalise
DNA amounts for transfections (GenBank#X02514; New England Biolabs Inc,
Beverly, MA, USA). All plasmids were propagated in DH5α E. coli cells (Life
Technologies GmbH, Karlsruhe, Germany) grown in Luria-Bertani medium and
purified with the plasmid DNA purification reagents "Plasmid-Maxi-Kit" (Qiagen
GmbH, Hilden, Germany) or "Concert high purity plasmid purification system"
(Life Technologies GmbH, Karlsruhe, Germany). Following purification, the
plasmid DNA concentrations were determined by absorption at 260 nm and 280
nm in UVette cuvettes (Eppendorf-Netheler-Hinz GmbH, Hamburg, Germany)
using a BioPhotometer (Eppendorf-Netheler-Hinz GmbH, Hamburg, Germany)
and the plasmids were diluted to the same concentration; finally these were
confirmed by separation of 10 ng of each plasmid on an ethidium bromide-stained
agarose gel.

B. Cell culture and transfections: Chinese hamster ovary (CHO) cells (Puck et al., 
J. Exp. Med., 108, 945 (1958)) were obtained from the Institute for Genetics 
(University of Cologne, Germany) as a population adapted to growth in DMEM 
medium. The cells were grown in DMEM/Glutamax medium (Life Technologies) 
supplemented with 10% fetal calf serum at 37°C, 10% C0 2 in humid atmosphere 
and passaged upon trypsin isation. One day before transfection 10 6 cells were 
plated into a 48-well plate (Falcon). For the transient transfection of cells with 
plasmids each well received into 250 ml of medium a total amount of 300 ng 
supercoiled plasmid DNA complexed before with the FuGene6 transfection 
reagent (Roche Diagnostics GmbH, Mannheim, Germany) according to the 
manufacturers protocol. Each 300 ng DNA preparation (Fig.2 sample 4 to 11) 
contained 50 ng of the luciferase expression vector pUHC13-l (Gossen et al., 
Proc Natl Acad Sci USA., 89 5547-5551 (1992)), 50 ng of the substrate vector 
pRK64, 0.5 ng or 1 ng of one of the recombinase expression vectors pCMV- 
C31Int(wt), pCMV-C31Int(NNLS), pCMV-C31Int(CNLS) or pCMV-Cre and 199 ng 
or 199.5 ng of pUC19 plasmid, except for the controls which received 50 ng of 
PUHC13-1 together with 50 ng of pRK64 (sample 3) or pRK64(Acre) (sample 2) 
and 200 ng pUC19, or 50 ng pUHC13-l with 250 ng pUC19 (sample 1). 
Transfections of Ssv and XisA recombinases (Fig.3) also contained 50 ng of the 
luciferase expression vector pUHC13-l, 50 ng of substrate vectors pPGKattA and 
pPGKnif and 10 ng or 20 ng of recombinase expression vector pCMV-SSV or 
pCMV-SSV(NNLS) or 25 ng or 100 ng of expression vectors pCMV-XisA/ pCMV- 
XisA(NNLS). Plasmid pUC19 was added to a total amount of 300 ng plasmid DNA. 
As the C31-Int expression vectors are 15% larger in size than pCMV-Cre and the 
same amounts of DNA of the three plasmids were used for transfection, the 
samples with C31-Int vectors received 15% fewer plasmid molecules than 
the samples with the Cre expression vector. The β-galactosidase values from 
C31-Int transfected samples were not corrected for this difference, so the 
calculated C31-Int activities are slightly underestimated. For each sample to be 
tested four individual wells were transfected. One day after the addition of the 
DNA preparations each well received an additional 250 µl of growth medium. The 
cells of each well were lysed 48 hours after transfection with 100 µl lysis reagent 
supplemented with protease inhibitors (Roche Diagnostics). The lysates were 
centrifuged and 20 µl were used to determine the β-galactosidase activities 
using the β-galactosidase reporter gene assay (Roche Diagnostics) according to 
the manufacturer's protocol in a Lumat LB 9507 luminometer (Berthold). To 
measure luciferase activity, 20 µl lysate was diluted into 250 µl assay buffer 
(50 mM glycylglycine, 5 mM MgCl₂, 5 mM ATP) and the "Relative Light Units" (RLU) 
were counted in a Lumat LB 9507 luminometer after addition of 100 µl of a 1 
mM luciferin (Roche Diagnostics) solution. The mean value and standard 
deviation of the samples were calculated from the β-galactosidase and luciferase 
RLU values obtained from the four transfected wells of each sample. 
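
The composition of the DNA mixtures can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes, as the control samples suggest, that each test sample contained 50 ng luciferase vector and 50 ng pRK64 in addition to the recombinase vector. It computes the pUC19 filler needed to reach 300 ng and the relative number of molecules delivered at equal mass by a plasmid that is 15% larger.

```python
TOTAL_NG = 300.0  # total plasmid DNA per well, as stated in the text


def puc19_filler(*component_ng, total_ng=TOTAL_NG):
    """Return the amount of pUC19 (ng) needed to bring a mix to total_ng."""
    filler = total_ng - sum(component_ng)
    if filler < 0:
        raise ValueError("components exceed the total DNA amount")
    return filler


def relative_molecules(size_ratio):
    """Relative molecule count at equal DNA mass for a plasmid that is
    size_ratio times larger than the reference plasmid."""
    return 1.0 / size_ratio


# Assumed mix: 50 ng luciferase vector + 50 ng pRK64 + recombinase vector
print(puc19_filler(50, 50, 0.5))   # filler for 0.5 ng recombinase vector
print(puc19_filler(50, 50, 1))     # filler for 1 ng recombinase vector
print(round(relative_molecules(1.15), 2))
```

The filler amounts agree with the 199.5 ng and 199 ng of pUC19 given in the text. At equal mass a 15% larger plasmid contributes 1/1.15 of the molecules, which is of the same order as the reduction quoted above.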

C. Results: To set up an assay system for the measurement of C31-Int and Cre 
recombinase efficiency in mammalian cells, the recombination substrate vector 
pRK64 shown in Figure 1E was first constructed. pRK64 contains an SV40 
promoter region for expression in mammalian cells followed by a 1.1 kb cassette 
which consists of the coding region of the puromycin resistance gene and a 
polyadenylation signal sequence. This cassette is flanked at the 5'-end by the 84 
bp attB and at the 3'-end by the 84 bp attP recognition site of C31-Int (Fig. 1 
and 6). These attB and attP sites are located on the same DNA molecule and 
oriented relative to each other such that the flanked DNA segment can be 
deleted. The same orientation of attB and attP sites is used naturally by the 
ΦC31 phage and the bacterial genome, leading to the integration of the phage 
genome when both sites are located on different DNA molecules (Thorpe et al., 
Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)). To measure C31-Int and 
Cre recombinase activities with the same substrate vector, pRK64 contains in 
addition two Cre recognition (loxP) sites in direct orientation next to the att sites. 
Since the att/lox-flanked cassette in plasmid pRK64 is inserted between the SV40 
promoter and the coding region of the β-galactosidase gene, its presence inhibits 
β-galactosidase expression as the SV40 promoter derived transcripts are 
terminated at the polyadenylation signal of the puromycin gene. Plasmid pRK64 
is turned into a β-galactosidase expression vector upon C31-Int or Cre mediated 
deletion of the att/lox-flanked puromycin cassette since the remaining single att 
and loxP site do not substantially interfere with gene expression. 

For the expression of recombinases a mammalian expression vector was 
designed which contains the CMV immediate early promoter followed by a hybrid 
intron, the coding region of the recombinase to be tested, and an artificial 
polyadenylation signal sequence. The backbone sequences of the four 
recombinase expression vectors shown in Figure 1A-D are identical to each other 
except for the recombinase coding region. Plasmid pCMV-C31Int(wt) (Fig. 1A) 
contains the nonmodified (wildtype) 1.85 kb coding region of C31-Int as found in 
the genome of phage ΦC31 (Kuhstoss et al., J. Mol. Biol. 222, 897-908 (1991)). 
Plasmid pCMV-C31Int(NNLS) (Fig. 1B) contains a modified C31-Int gene coding 
for the full length C31-Int protein with an N-terminal extension of 7 amino acids 
derived from the SV40 virus large T antigen which serves as a nuclear 
localisation signal (NLS). Plasmid pCMV-C31Int(CNLS) (Fig. 1C) contains a 
C-terminal extension of 7 amino acids derived from the SV40 virus large T antigen 
which serves as a nuclear localisation signal (NLS). Plasmid pCMV-Cre (Fig. 1D) 
contains the 1.1 kb Cre coding region with an N-terminal fusion of the 7 amino 
acid NLS of the SV40 T-antigen. For Cre recombinase it has been shown that the 
N-terminal addition of the SV40 T-antigen NLS does not increase its 
recombination efficiency in mammalian cells (Le et al., Nucleic Acids Res., 27, 
4703 - 4709 (1999)). 

As a test system to compare the efficiency of the 4 recombinases, the same 
amount of plasmid DNA of each of the recombinase expression vectors together 
with a fixed amount of the reporter plasmid pRK64 was transiently introduced 
into Chinese Hamster Ovary (CHO) cells. Thus, in this assay design the efficiency 
of the various recombinases was compared on an extrachromosomal substrate 
introduced into the CHO cells as a circular plasmid. Two days after transfection 
the cells from the various samples were lysed and the activity of β-galactosidase 
in the lysates was determined by a specific chemiluminescence assay and 
expressed in "Relative Light Units" (RLU (β-Gal)) (Fig. 2). In addition all samples 
contained a fixed amount of a luciferase expression vector to control for the 
experimental variation of cell transfection and lysis. For this purpose the lysates 
of each sample were also tested for luciferase activity with a specific 
chemiluminescence assay and the values expressed as "Relative Light Units" 
(RLU (Luciferase)) (Fig. 2). All transfection samples contained in addition varying 
amounts of the unrelated cloning plasmid pUC19 so that all samples were 
equalised to the same amount of plasmid DNA. As a positive control for 
β-galactosidase a derivative of the recombination reporter pRK64 was used in 
which the loxP flanked 1.1 kb cassette has been removed through Cre mediated 
recombination in E. coli, giving rise to plasmid pRK64(ΔCre). Negative controls 
were samples which received the unrecombined reporter plasmid pRK64 but no 
recombinase expression vector, as well as samples set up with the pUC19 plasmid 
alone. 

To determine the relative efficiency of the tested recombinases the RLU values of 
β-galactosidase were divided individually for each sample by the RLU values 
obtained for luciferase and multiplied by 10⁵. From the values of the four data 
points of each sample the mean value and standard deviation were calculated as 
an indicator of recombinase activity (Gal/Luc) (Fig. 2). The relative activity of the 
tested recombinases was then compared to the positive control defined as an 
activity of 1. 
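
The normalisation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for Fig. 2: the function names are hypothetical, the RLU readings are made-up numbers, and the sample standard deviation (n - 1) is an assumption because the text does not state which estimator was used.

```python
from statistics import mean, stdev

SCALE = 1e5  # the factor 10^5 stated in the text


def gal_luc_ratios(gal_rlu, luc_rlu):
    """Per-well activity: beta-Gal RLU divided by luciferase RLU, times 10^5."""
    if len(gal_rlu) != len(luc_rlu):
        raise ValueError("need one luciferase value per beta-Gal value")
    return [g / l * SCALE for g, l in zip(gal_rlu, luc_rlu)]


def summarize(gal_rlu, luc_rlu):
    """Mean and sample standard deviation of the per-well Gal/Luc values."""
    ratios = gal_luc_ratios(gal_rlu, luc_rlu)
    return mean(ratios), stdev(ratios)


def relative_activity(sample_mean, positive_control_mean):
    """Activity relative to the positive control, which is defined as 1."""
    return sample_mean / positive_control_mean


# Made-up readings for four wells each (illustration only, not patent data)
sample_mean, sample_sd = summarize([1200, 1300, 1250, 1150],
                                   [400000, 410000, 390000, 405000])
control_mean, _ = summarize([4000, 4200, 3900, 4100],
                            [400000, 405000, 395000, 410000])
print(round(relative_activity(sample_mean, control_mean), 2))
```

Dividing each well by its own luciferase reading before averaging keeps well-to-well differences in transfection and lysis out of the mean.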

As shown in Fig. 2, the expression of Cre recombinase (samples 10 and 11) 
resulted in a 150 to 170-fold increase of β-galactosidase activity as compared to 
the negative control (sample 3), demonstrating the wide dynamic range of our 
test system. Each recombinase vector was tested using two different amounts of 
DNA for transfection (0.5 and 1 ng/sample), which in the case of Cre resulted in 
63% and 72% recombinase activity (samples 10 and 11 as compared to the 
positive control). These two values establish that the DNA amounts used are 
close to the test system's saturation for recombinase expression, as the doubling 
of DNA amounts resulted only in a minor increase of recombinase activity. 

In comparison to Cre, the expression of wildtype C31-Int resulted in a 
considerably lower recombinase activity of 23% and 30% (Fig. 2, samples 4 and 
5) as compared to the positive control. These values represent 37% and 42% 
recombinase activity for wildtype C31-Int as compared to Cre recombinase 
(compare samples 4 and 5 with 10 and 11). Upon the expression of C31-Int 
fused with the N-terminal NLS (C31-Int(NNLS)) values of 32% and 36% 
recombinase activity (samples 6 and 7) were obtained as compared to the 
positive control. The C31-Int(NNLS) values represent 51% and 50% recombinase 
activity as compared to Cre (compare samples 6 and 7 to 10 and 11). Thus, the 
activity of C31-Int in mammalian cells is just moderately enhanced by the 
addition of an N-terminal NLS signal. 

Surprisingly, upon the expression of C31-Int fused with the C-terminal NLS 
(C31-Int(CNLS)) values of 50% and 65% recombinase activity (samples 8 and 9) were 
obtained as compared to the positive control. The C31-Int(CNLS) values 
represent 79% and 90% recombinase activity as compared to Cre recombinase 
(compare samples 8 and 9 to 10 and 11). Unexpectedly, C31-Int(CNLS) exhibits 
a dramatic, more than twofold increase of recombinase activity in comparison to 
C31-Int(wt) (compare samples 8 and 9 to 4 and 5). 
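
The percentages relative to Cre quoted above follow directly from the values relative to the positive control. The short Python sketch below reproduces that arithmetic from the figures given in the text; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Percent activity relative to the positive control, as reported for Fig. 2
# (first value: 0.5 ng, second value: 1 ng of recombinase expression vector)
ACTIVITY = {
    "C31-Int(wt)": (23, 30),
    "C31-Int(NNLS)": (32, 36),
    "C31-Int(CNLS)": (50, 65),
    "Cre": (63, 72),
}


def percent_of_reference(values, reference):
    """Express each value as a rounded percentage of the matching reference value."""
    return tuple(round(100 * v / r) for v, r in zip(values, reference))


for name in ("C31-Int(wt)", "C31-Int(NNLS)", "C31-Int(CNLS)"):
    print(name, percent_of_reference(ACTIVITY[name], ACTIVITY["Cre"]))
```

Each recombinase is compared with Cre at the same DNA amount, which is why the 0.5 ng and 1 ng values are kept as pairs.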

In order to test whether the addition of an NLS sequence may be a general, 
simple method to enhance recombinase activity in mammalian cells we extended 
our studies by two additional recombinases: XisA recombinase (XisA) derived 
from the cyanobacterium Anabaena, and SSV-Integrase (SSV-Int) derived from 
the SSV1 virus of the thermophilic bacterium Sulfolobus shibatae. To this end we 
constructed mammalian expression vectors for the wildtype forms of XisA and 
SSV recombinases and compared their activity to versions which were modified 
by the N-terminal addition of the 7 amino acid NLS of the SV40 T-antigen. These 
recombinases were compared by the use of the reporter vector shown in Fig. 1E, 
except that the att elements of C31-Int were replaced by the nif recognition 
sequences for XisA or the att sequences for SSV-Int. As described above for 
C31-Int, recombinase activities were tested by transient transfection into CHO cells 
using the reporter vector derived β-galactosidase activity as readout and 
cotransfected luciferase as internal control. 

As shown in Fig. 3, for both XisA and SSV recombinases the addition of an NLS 
sequence did not improve their activity in a mammalian cell line as compared to 
the wildtype forms. At both DNA concentrations tested, wildtype XisA exhibits a 
significant recombination activity as compared to the reporter vector alone 
(compare samples 2 and 3 to sample 1), but this activity is not altered by the 
addition of an NLS (compare samples 2 and 3 to samples 4 and 5). SSV-Int 
exhibits only a low recombination activity (compare samples 7 and 8 with sample 
6) which is also not enhanced by the addition of an NLS (compare samples 9 and 
10 with samples 7 and 8). From these results we conclude that the addition of an 
NLS to an inefficient recombinase is not a general, simple method to improve its 
performance in mammalian cells. 

Taken together, in the transient transfection test system shown in Figure 2 a 
more than twofold activity increase of the ΦC31 integrase could be achieved by 
the C-terminal, but not the N-terminal, addition of the SV40 T antigen NLS signal. 
As this signal sequence has been characterised to act as a nuclear localisation 
signal (Kalderon et al., Cell, 39, 499 - 509 (1984)) we conclude that the 
efficiency increase of C31-Int(CNLS) is the result of the improved nuclear 
accumulation of this recombinase. The relative inefficiency of C31-Int(NNLS) 
may be explained by the inaccessibility of the NLS peptide to the nuclear import 
machinery at the N-terminal position of the C31-Int protein. 
In particular, it could be shown that C31-Int(CNLS) recombines 
extrachromosomal DNA in mammalian cells almost as efficiently as the widely used 
Cre recombinase and thus provides an additional or alternative recombination 
system of highest activity. The efficiency increase of C31-Int(CNLS) as compared 
to its wildtype form is regarded as an invention of substantial use for 
biotechnology. 

Example 2 

As demonstrated in example 1, C31-Int recombinase with the C-terminal fusion of 
the SV40 T-antigen NLS (C31-Int(CNLS)) shows in mammalian cells a 
recombination activity comparable to Cre recombinase on an extrachromosomal 
plasmid vector. We further tested whether C31-Int(CNLS) exhibits a 
similar activity on a recombination substrate which is chromosomally integrated 
into the genome of mammalian cells. This question is critical for the use of a 
recombination system for genome engineering, as it is possible that a 
recombinase acts efficiently on extrachromosomal substrates but is impaired 
if the recognition sites are part of the mammalian chromatin. To characterise the 
recombination activity of C31-Int(CNLS) and C31-Int(NNLS) on a chromosomal 
substrate, the pRK64 reporter plasmid (Fig. 1E), containing a pair of loxP and 
att sites, was stably integrated into the genome of a mammalian cell line. One of 
the stably transfected clones was chosen for further analysis and was transiently 
transfected with recombinase expression vectors coding for C31-Int(CNLS), 
C31-Int(NNLS), C31-Int(wt) or Cre recombinase. The activity of β-galactosidase 
derived from the reporter vector recombined in these cells was taken as a 
measure of recombination efficiency. 

A. Plasmid constructions: all plasmids used and their purification are described in 
example 1. 


B. Cell culture and transfections: To generate a stably transfected C31-Int 
reporter cell line, 2.5 x 10⁶ NIH-3T3 cells (Andersson et al., Cell, 16, 63-75 
(1979); DSMZ#ACC59; DSMZ-Deutsche Sammlung von Mikroorganismen und 
Zellkulturen GmbH, Mascheroder Weg 1b, D-38124 Braunschweig, Germany) 
were electroporated with 5 µg pRK64 plasmid DNA linearised with ScaI and 
plated into 10 cm petri dishes. The cells were grown in DMEM/Glutamax medium 
(Life Technologies) supplemented with 10% fetal calf serum at 37°C, 10% CO₂ in 
humid atmosphere, and passaged upon trypsinisation. Two days after transfection 
the medium was supplemented with 1 mg/ml of puromycin (Calbiochem) for the 
selection of stable integrants. Upon the growth of resistant colonies these were 
isolated under a stereomicroscope and individually expanded in the absence of 
puromycin. To demonstrate stable integration of the transfected vector, genomic 
DNA of puromycin resistant clones was prepared according to standard methods 
and 5-10 µg were digested with EcoRV. Digested DNA was separated in a 0.8% 
agarose gel and transferred to nylon membranes (GeneScreen Plus, NEN 
DuPont) under alkaline conditions for 16 hours. The filter was dried and 
hybridised for 16 hours at 65°C with a probe representing the 5' part of the E. 
coli β-galactosidase gene (1.25 kb NotI - EcoRV fragment of plasmid CMV-β-pA 
(R. Kuhn, unpublished)). The probe was radiolabeled with ³²P-labelled α-dCTP 
(Amersham) using the Megaprime Kit (Amersham). Hybridisation was performed 
in a buffer consisting of 10% dextran sulfate, 1% SDS, 50 mM Tris and 100 mM 
NaCl (pH 7.5). After hybridisation the filter was washed with 2x SSC/1% SDS and 
exposed to BioMax MS1 X-ray films (Kodak) at -80°C. 

Transfections of the selected clone 3T3(pRK64)-3 with plasmid DNAs and the 
measurement of β-galactosidase activities in lysates were essentially performed 
as described in example 1 for CHO cells, except that 32 ng or 64 ng of the 
recombinase expression plasmids and 218 or 186 ng of pUC19 plasmid were 
used and the pRK64 plasmid was omitted from all samples. 

C. Histochemical detection of β-galactosidase activity in transfected 
3T3(pRK64)-3 cells 

To directly demonstrate β-galactosidase expression in recombinase transfected 
cells, 10⁴ 3T3(pRK64)-3 cells were plated one day before transfection into each 
well of a 48-well tissue culture plate (Falcon). For the transient transfection of 
cells with plasmids, each well, containing 250 µl of medium, received a total of 
150 ng supercoiled plasmid DNA that had been complexed beforehand with the 
FuGene6 transfection reagent (Roche Diagnostics GmbH, Mannheim, Germany) 
according to the manufacturer's protocol. Each 150 ng DNA preparation contained 
50 ng of the recombinase expression vector pCMV-Cre or pCMV-C31Int(CNLS) and 
100 ng of the pUC19 plasmid. After 2 days the culture medium was removed from the 
wells, the wells were washed once with phosphate buffered saline (PBS), and the 
cells were fixed for 5 minutes at room temperature in a solution of 2% 
formaldehyde and 1% glutaraldehyde in PBS. Next, the cells were washed twice 
with PBS and finally incubated in X-Gal staining solution for 24 hours at 37°C 
(staining solution: 5 mM K₃(Fe(CN)₆), 5 mM K₄(Fe(CN)₆), 2 mM MgCl₂, 1 mg/ml 
X-Gal (BioMol) in PBS) until photographs were taken. 

D. Results 

To generate a mammalian cell clone with a stable genomic integration of the 
C31-Int and Cre recombinase reporter plasmid pRK64, the murine fibroblast cell 
line NIH-3T3 was electroporated with linearised pRK64 DNA (Fig. 1D; see also 
example 1) and subjected to selection in puromycin containing growth medium. 
Plasmid pRK64 contains in between the pair of loxP and att sites the coding 
region of the puromycin resistance gene expressed from the SV40-IE promoter. 
Thirty-six puromycin resistant clones were isolated and the genomic DNA of 19 
clones was analysed for the presence and copy number of the pRK64 DNA. Three 
clones, which apparently contain 2 - 4 copies of pRK64, were further 
characterised on the single cell level for the expression of β-galactosidase upon 
transient transfection with the Cre expression vector pCMV-Cre (Fig. 1C). The cell 
clone with the largest proportion of β-galactosidase positive cells, 3T3(pRK64)-3, 
was selected as most useful for the planned studies on C31-Int and Cre 
recombinase efficiency. 

To compare the efficiency of wildtype C31-Int (C31-Int(wt)), C31-Int(CNLS), 
C31-Int(NNLS), and Cre recombinases, 32 ng or 64 ng of the recombinase 
expression vectors pCMV-C31Int(wt), pCMV-C31Int(CNLS), pCMV-C31Int(NNLS), 
or pCMV-Cre (Fig. 1A-D) together with the unrelated cloning plasmid pUC19 
were transiently introduced into 3T3(pRK64)-3 cells, such that all samples 
contained the same amount of plasmid DNA. As a negative control a sample 
prepared with the pUC19 plasmid alone was used. Two days after transfection 
the cells from the various samples were lysed and the activity of β-galactosidase 
in the lysates was determined by a specific chemiluminescence assay and 
expressed in "Relative Light Units" (RLU (β-Gal)) (Fig. 4). From the values of the 
four data points of each sample the mean value and standard deviation were 
calculated as an indicator of recombinase activity (Fig. 4). The relative activity of 
the tested recombinases was then compared to the highest value obtained with 
the Cre expression vector, defined as an activity of 1. 

As shown in Figure 4, the expression of Cre recombinase (samples 8 and 9) 
resulted in a 36 to 49-fold increase of β-galactosidase activity as compared to the 
negative control (sample 1), demonstrating the dynamic range of the test system 
used. Each recombinase vector was tested using two different amounts of DNA 
for transfection (32 ng and 64 ng/sample), which in the case of Cre resulted in 
73% and 100% recombinase activity (samples 8 and 9). These two values 
establish that the DNA amounts used are not far from the linear range of the test 
system's capacity for recombinase expression, as the twofold increase of the 
amount of DNA also resulted in a significant increase of recombinase activity. 

The expression of wildtype C31-Int (Fig. 4, samples 2 and 3) resulted in a low 
recombinase activity of 4% and 10% as compared to the values obtained by Cre 
transfection (compare samples 2 and 3 with 8 and 9). This activity was only 
moderately enhanced by the expression of C31-Int(NNLS) to values of 19% and 
22% of Cre activity (compare samples 4 and 5 with samples 8 and 9). Upon the 
expression of C31-Int(CNLS) values of 48% and 78% recombinase activity were 
obtained as compared to Cre recombinase (compare samples 6 and 7 to 8 and 
9). Hence, C31-Int(CNLS) exhibits a 12-fold higher activity than C31-Int(wt) at 
32 ng plasmid DNA (Fig. 4, compare samples 6 and 2) and an 8-fold higher 
activity than C31-Int(wt) at 64 ng plasmid DNA (Fig. 4, compare samples 7 and 
3). 


In addition, we aimed to directly demonstrate in situ the expression of 
β-galactosidase in 3T3(pRK64)-3 cells after transfection with Cre or C31-Int(CNLS) 
recombinase plasmid. Two days after transfection the cells were fixed in situ and 
incubated with the histochemical X-Gal assay which detects β-galactosidase 
positive cells by a blue precipitate. As shown in Figure 5, stained cells were found 
at a comparable frequency in the samples transfected with the Cre or 
C31-Int(CNLS) expression vectors but not in the nontransfected control. This result 
confirms that the β-galactosidase activities measured by chemiluminescence 
upon recombinase transfection (Fig. 4) result from a population of individual, 
recombined reporter cells. 

In conclusion, upon the transient transfection of recombinase expression vectors 
into a cell line with a genomic integration of the recombination substrate vector, 
an 8- to 12-fold activity increase of the ΦC31 integrase by the C-terminal fusion 
with the SV40 T-antigen NLS signal was found. As this signal sequence has been 
characterised to act as a nuclear localisation signal (Kalderon et al., Cell, 39, 499 
- 509 (1984)), it was concluded that the dramatic efficiency increase of 
C31-Int(CNLS) is the result of the improved nuclear accumulation of this 
recombinase. The approximately tenfold activity increase of C31-Int(CNLS) upon 
expression in a cell line with a genomic integration of the substrate vector is 
considerably higher than the activity increase found upon the transient 
expression of both vectors (see example 1). Thus, a substrate vector integrated 
into the chromatin of a mammalian cell may pose more stringent requirements 
on recombinase activity to be recombined as compared to an extrachromosomal 
substrate. 
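
The difference between the two test systems can be made explicit by computing the fold increase of C31-Int(CNLS) over the wildtype enzyme from the percentages reported in examples 1 and 2. The Python sketch below uses only those reported values; the function name is an illustrative choice.

```python
def fold_increase(modified, wildtype):
    """Fold increase of the modified over the wildtype recombinase,
    calculated pairwise for the two DNA amounts and rounded to one decimal."""
    return [round(m / w, 1) for m, w in zip(modified, wildtype)]


# Example 1, plasmid substrate (% of positive control): CNLS 50/65, wt 23/30
extrachromosomal = fold_increase([50, 65], [23, 30])

# Example 2, chromosomal substrate (% of highest Cre value): CNLS 48/78, wt 4/10
chromosomal = fold_increase([48, 78], [4, 10])

print("extrachromosomal substrate:", extrachromosomal)
print("chromosomal substrate:", chromosomal)
```

The reference differs between the two examples (positive control versus highest Cre value), but it cancels out within each ratio, so the fold increases can be compared directly.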

The dramatic activity increase of C31-Int(CNLS), as compared to its wildtype 
form, on a stably integrated substrate in mammalian cells is an invention of 
significant practical use as this recombinase is as efficient as the widely used 
Cre/loxP system; thus, C31-Int(CNLS) provides an additional or alternative 
recombination system of highest activity. 

Example 3 

To demonstrate that the increase in β-galactosidase activity obtained by the 
cotransfection of a C31-Int expression vector and the reporter vector pRK64 into 
mammalian cells is in fact the result of recombinase mediated deletion, one of 
the recombination products was detected by a specific polymerase chain reaction 
(PCR). The amplified PCR product was cloned and its sequence determined. The 
obtained sequence confirms that C31-Int mediated deletion of the test vector 
occurs in a mammalian cell line and that the recombination occurs at the known 
breakpoint within the attB and attP sites. 

A. Plasmid constructions: The construction of plasmids pRK64, pCMV-Cre and 
pCMV-C31Int(wt) is described in Example 1. To simulate the recombination of 
pRK64 by C31-Int, the sequence between the CAA motifs of the att sites 
(boxed in Fig. 5) was deleted from the computer file of pRK64, giving rise to the 
sequence of pRK64(ΔInt) (SEQ ID NO: 16). 
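
The in silico deletion described above can be sketched as a simple string operation. The Python example below is a hypothetical illustration: the toy sequence, the function name and the way the core positions are supplied are assumptions and do not reproduce the actual pRK64 file or SEQ ID NO: 16. It removes everything between the two core motifs together with one copy of the motif, so that a single hybrid site remains.

```python
def simulate_att_deletion(seq, attb_core_start, attp_core_start, core="CAA"):
    """Return seq with the segment between the two core motifs deleted,
    keeping a single copy of the core at the junction."""
    n = len(core)
    if attp_core_start <= attb_core_start:
        raise ValueError("attP core must lie downstream of attB core")
    if seq[attb_core_start:attb_core_start + n] != core:
        raise ValueError("no core motif at the attB position")
    if seq[attp_core_start:attp_core_start + n] != core:
        raise ValueError("no core motif at the attP position")
    # Keep everything up to and including the attB core, then continue
    # directly after the attP core.
    return seq[:attb_core_start + n] + seq[attp_core_start + n:]


# Toy substrate: flank - "attB" - cassette - "attP" - flank (not real sequence)
toy = "GGG" + "aaCAAtt" + "NNNNNN" + "ccCAAgg" + "TTT"
first = toy.index("CAA")
second = toy.index("CAA", first + 1)
print(simulate_att_deletion(toy, first, second))
```

In this toy case the left arm of the first site is joined to the right arm of the second site, which mirrors the single hybrid site expected on the parental plasmid after deletion.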

B. Transfection of cells and PCR amplification: MEF5-5 mouse fibroblasts 
(Schwenk et al., 1998) (20000 cells per well of a 12 well plate (Falcon)) were 
transfected with 0.5 µg pRK64 alone or together with 250 ng pCMV-Int(wt) or 
pCMV-Cre using the FuGene6 transfection reagent following the manufacturer's 
protocol (Roche Diagnostics). After 2 days DNA was extracted from these cells 
according to standard methods and used for PCR amplification with primers P64-1 
(SEQ ID NO: 17; complementary to position 111-135 of pRK64(ΔInt)) and P64-4 
(SEQ ID NO: 18; complementary to position 740-714 of pRK64(ΔInt)) using the 
Expand High Fidelity PCR kit (Roche Diagnostics). PCR products were separated 
on a 0.8% agarose gel, extracted with the QIAEX kit (Qiagen) and cloned into 
the pCR2.1 vector using the TA cloning kit (Invitrogen) resulting in plasmid 
pRK80d. The sequence of its insert, seq80d (SEQ ID NO: 19), was determined 
using the reverse sequencing primer and standard sequencing methods (MWG 
Biotech). 

For the measurement of β-galactosidase activity the cells were lysed 2 days after 
transfection and the β-galactosidase activities were determined with the 
β-galactosidase reporter gene assay (Roche Diagnostics) as described in example 
1. 



C. Results: As a test vector for C31-Int mediated DNA recombination, plasmid 
pRK64 was used, which contains the 1.1 kb coding region of the puromycin 
resistance gene flanked 5' by the 84 bp attB and 3' by the 84 bp attP 
recognition site of C31-Int (Fig. 5). These attB and attP sites are located on the 
same DNA molecule and oriented relative to each other such that the 
att-flanked DNA segment can be deleted. The same orientation of attB and attP 
sites is used naturally by the ΦC31 phage and the bacterial genome for the 
integration of the phage genome when both sites are located on different DNA 
molecules (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)). As 
a positive control, vector pRK64 contains in addition two Cre recombinase 
recognition (loxP) sites in direct orientation next to the att sites. Since the 
att-flanked DNA segment in plasmid pRK64 is inserted between a promoter active in 
mammalian cells and the β-galactosidase gene, its deletion can be measured by 
the increase of β-galactosidase activity. The expected product of C31-Int 
mediated deletion of plasmid pRK64 is shown in Fig. 6, designated as 
pRK64(ΔInt). If the recombination between attB and attP occurs as described in 
bacteria (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)), a 
single attR site is generated and left on the parental plasmid (Fig. 6) while the 
flanked DNA is excised and contains an attL site. Besides the measurement of 
β-galactosidase activity, C31-Int mediated recombination of pRK64 can be directly 
detected on the DNA level by a specific polymerase chain reaction (PCR) using 
the primers P64-1 and P64-4 (Fig. 6). These primers, located 5' of the attB site 
(P64-1) and 3' of the attP site (P64-4), are designed to amplify a PCR product of 
630 bp length upon the C31-Int mediated recombination of pRK64. For the 
expression of C31-Int in mammalian cells plasmid pCMV-C31(wt) was used, which 
contains the CMV-IE promoter upstream of the C31-Int coding region followed by 
a synthetic polyadenylation signal (see Example 1 and Fig. 1). 

The recombination substrate vector pRK64 was transiently transfected into the 
murine fibroblast cell line MEF5-5 either alone, or together with the C31-Int 
expression vector pCMV-C31(wt), or together with an expression vector for Cre 
recombinase, pCMV-Cre. Two days after transfection half of the cells of each sample 
were lysed and used to measure β-galactosidase activity by chemiluminescence, 
and the other half was used for the preparation of DNA from the transfected cells 
for PCR analysis. The β-galactosidase levels of the 3 samples were as follows 
(expressed as "Relative Light Units" (RLU) with the standard deviation (SD) 
of the β-galactosidase assay): 



Sample                        RLU (± SD) 
1) pRK64                      692 ± 5 
2) pRK64 + pCMV-Cre           8527 ± 269 
3) pRK64 + pCMV-C31(wt)       1288 ± 93 



As the coexpression of the test vector pRK64 together with the C31-Int 
expression vector in sample 3 leads to a significant increase of β-galactosidase 
activity as compared to pRK64 alone, this result suggests that pRK64 is 
recombined by C31-Int as anticipated in Fig. 6. 
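
The size of the increase can be quantified from the three RLU values reported above. The Python sketch below calculates the fold increase over pRK64 alone and, as an assumption not made in the original text, propagates the standard deviations with the usual first-order formula for a ratio of independent measurements. The function name is illustrative.

```python
from math import sqrt

# Reported values: (RLU, SD)
RLU = {
    "pRK64": (692, 5),
    "pRK64 + pCMV-Cre": (8527, 269),
    "pRK64 + pCMV-C31(wt)": (1288, 93),
}


def fold_with_sd(value, value_sd, baseline, baseline_sd):
    """Fold change value/baseline with first-order error propagation,
    assuming the two measurements are independent."""
    fold = value / baseline
    relative_error = sqrt((value_sd / value) ** 2 + (baseline_sd / baseline) ** 2)
    return fold, fold * relative_error


base, base_sd = RLU["pRK64"]
for name in ("pRK64 + pCMV-Cre", "pRK64 + pCMV-C31(wt)"):
    value, sd = RLU[name]
    fold, fold_sd = fold_with_sd(value, sd, base, base_sd)
    print(f"{name}: {fold:.2f}-fold (+/- {fold_sd:.2f})")
```

Under these assumptions wildtype C31-Int gives a little less than a twofold increase and Cre roughly a twelvefold increase over the reporter alone.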

Next, cellular DNA was prepared from the three samples and tested for the 
occurrence of the expected Cre or C31-Int generated deletion product by PCR 
using primers P64-1 and P64-4 for amplification. As shown in Fig. 7, an 
amplification product of the expected size was found only in the samples 
cotransfected with the Cre or C31-Int recombinase expression vectors (Fig. 7A, 
lane 3 and lane 4). The PCR products amplified from pRK64 recombined by 
C31-Int or Cre are of the same size but should be recombined via the attB/P or loxP 
sites, respectively. 

To prove that the PCR product found after cotransfection of plasmids pRK64 and 
pCMV-C31(wt) represents in fact the deletion product of C31-Int mediated 
recombination, this DNA fragment was cloned into a plasmid vector and its DNA 
sequence determined. One clone, pRK80d, was analysed, and its sequence 
showed exactly the sequence of an attR site as expected from C31-Int mediated 
deletion of pRK64 (Fig. 7B, compare to Fig. 6). 

In conclusion, this experiment demonstrates that C31-Int mediated deletion of a 
vector containing a pair of attB/attP sites occurs in a mammalian cell line, and 
that the recombination occurs within the same 3 bp breakpoint region of attB and 
attP as found in bacteria (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 
5510 (1998)). Thus, it was concluded that an increase of β-galactosidase activity 
observed by cotransfection of the pRK64 reporter vector and a C31-Int 
expression vector in mammalian cells truly reflects C31-Int recombinase activity. 



Example 4 

As has been demonstrated in examples 1-3, the C31-Int recombinase with the 
C-terminal fusion of the SV40 T-antigen NLS (C31-Int(CNLS)) shows a 
recombination activity comparable to Cre recombinase on an extrachromosomal 
as well as a chromosomally integrated target in mammalian cells in vitro. To test 
whether C31-Int(CNLS) exhibits activity in mice, transgenic mice carrying a 
C31-Int(CNLS) expression vector were generated. These transgenic mice were 
crossed with reporter mice carrying the recombinase substrate. 
Recombination-mediated expression of β-galactosidase, which can be measured by 
staining with the substrate X-Gal, was analyzed in testes of double transgenic 
progeny carrying both the recombinase and the reporter. 

A. Plasmid constructions: For the construction of the C31-Int(CNLS) transgene 
expression vector, the C31Int gene with C-terminal NLS was isolated as a 2 kb 
fragment generated by restriction of pCMV-C31Int(CNLS) (SEQ ID NO: 12) with 
BglII. The fragment was ligated into the BglII restriction site of the vector 
pCAGGS-Cre-pA (SEQ ID NO: 104) giving rise to the plasmid pCAGGS-C31CNLS-pA 
(SEQ ID NO: 105). In pCAGGS-C31CNLS-pA the C31-Int(CNLS) (position 
1891-3753) is transcribed from the CAGGS promoter (position 1-1616) and 
followed by the SV40 late region polyadenylation sequence (position 3763-3941). 

B. Production of transgenic mice: For the embryo injections a 3.95 kb fragment 
was generated by restriction of the plasmid pCAGGS-C31CNLS-pA with PstI and 
AscI. This fragment was purified as follows: DNA bands were separated on an 
agarose gel without ethidium bromide. One part of the gel was stained with 
ethidium bromide to locate the band to excise. The DNA was electroeluted from 
the excised band with an S&S Biotrap Elution Chamber in 1x TAE (40 mM 
Tris-acetate, 1 mM EDTA) overnight. The DNA was precipitated from the eluate with 
1/10 volume 3 M sodium acetate and 2.5 volumes ethanol at -20 °C for several 
hours. The DNA was pelleted by centrifugation at 13000 rpm for 30 min and 
washed twice with 70 % ethanol. The dried DNA pellet was resuspended in TE 
(10 mM Tris, 1 mM EDTA, pH 8). Subsequently the precipitation procedure was 
repeated once and the DNA resuspended in injection buffer (10 mM Tris pH 7.2, 
0.1 mM EDTA). The sample was dialysed with a Slide-A-Lyzer Mini Dialysis Unit 
(Pierce) in injection buffer with several changes of buffer at 4°C overnight. 
Different amounts of the sample were checked on a gel to determine the 
concentration. To generate transgenic mice, 5-10 fg of the purified fragment 
were injected into one pronucleus of (B6CBA)F2 mouse one-cell embryos. The 
injected embryos were subsequently transferred into the oviduct of 0.5 day 
pseudopregnant NMRI females. 



C. Analysis of transgenic mice: Mice were analyzed for the presence of the
pCAGGS-C31CNLS-pA transgene by PCR using tail DNA and the primers
C31-screen 1 (SEQ ID NO: 100) and C31-screen 2 (SEQ ID NO: 101) amplifying a
fragment of 500 bp. The PCR reaction contained 5 µl PCR buffer (Invitrogen),
2 µl 50 mM MgCl2, 1.5 µl 10 mM dNTP-mix, 2 µl (10 pmol) of each primer, 0.5 µl
Taq-polymerase (5 U/µl) and water to a volume of 50 µl. The program used for
the PCR reactions was: 94 °C for 30 s, 55 °C for 30 s and 72 °C for 1 min in 30
cycles.
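
The reagent volumes listed for the screening PCR can be converted into final concentrations with simple dilution arithmetic. The following is a minimal sketch, not part of the original protocol: it reads the listed volumes as microlitres in a 50 µl reaction, and the function names `final_concentration` and `pcr_mix_summary` are hypothetical helpers introduced only for illustration. Whether "10 mM dNTP-mix" means 10 mM total or 10 mM of each nucleotide is not stated in the text, so the dNTP figure simply reports the dilution of the stock as given.

```python
def final_concentration(stock_conc, added_ul, total_ul=50.0):
    """Concentration after diluting `added_ul` of a stock into `total_ul`.

    The result has the same unit as `stock_conc`.
    """
    return stock_conc * added_ul / total_ul


def pcr_mix_summary(mgcl2_ul, dntp_ul, primer_pmol, taq_ul, total_ul=50.0):
    """Summarise final reagent levels for one reaction.

    Stock values (50 mM MgCl2, 10 mM dNTP mix, 5 U/ul Taq) are those named in
    the text; everything else about this helper is an illustrative assumption.
    """
    return {
        "MgCl2_mM": final_concentration(50.0, mgcl2_ul, total_ul),
        "dNTP_mM": final_concentration(10.0, dntp_ul, total_ul),
        # pmol per microlitre is numerically equal to micromolar
        "primer_uM": primer_pmol / total_ul,
        "Taq_units": 5.0 * taq_ul,
    }


# Volumes of the transgene screening PCR described in section C
screen_pcr = pcr_mix_summary(mgcl2_ul=2.0, dntp_ul=1.5, primer_pmol=10.0, taq_ul=0.5)
print(screen_pcr)
```

Under these assumptions the screening reaction works out to about 2 mM MgCl2, 0.3 mM dNTP mix, 0.2 µM of each primer and 2.5 U of Taq polymerase per 50 µl.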

D. Analysis of C31-Int(CNLS) activity: Founder mice transgenic for the
pCAGGS-C31CNLS-pA transgene were crossed to heterozygous C31 reporter mice
carrying the C31 reporter construct in the ROSA26 locus (SEQ ID NO: 106)
(Fig. 8).

Offspring of the crosses were genotyped for the presence of the
pCAGGS-C31CNLS-pA transgene by the PCR assay described in section C as well as
for the ROSA26-C31 reporter allele by a LacZ-specific PCR assay. The PCR was
performed using tail DNA and the primers β-Gal 3 (SEQ ID NO: 102) and β-Gal 4
(SEQ ID NO: 103) amplifying a fragment of 315 bp. The PCR reaction contained
5 µl PCR buffer (Invitrogen), 2.5 µl 50 mM MgCl2, 2 µl 10 mM dNTP-mix, 1 µl
(10 pmol) of each primer, 0.4 µl Taq-polymerase (5 U/µl) and water to a volume
of 50 µl. The program used for the PCR reactions was: 94 °C for 1 min, 60 °C
for 1 min and 72 °C for 1 min in 30 cycles.

Testes from mice carrying the pCAGGS-C31CNLS-pA transgene as well as the
reporter locus and from a control mouse carrying the reporter allele only were
dissected. The tissues were embedded in OCT tissue freezing medium
(Leica/Jung) and frozen in liquid nitrogen. Cryosections were generated from the
embedded tissues using a Leica CM3050 cryomicrotome, dried on
polylysine-coated slides for 1-4 hours and then stained as follows: Sections
were fixed in 0.2 % glutaraldehyde, 5 mM EGTA, 2 mM MgCl2 in 0.1 M PB
(K2HPO4/KH2PO4, pH 7.3) for 5 min at room temperature and washed in wash buffer
(2 mM MgCl2, 0.02 % Nonidet P-40 in 0.1 M PB) 3 times for 15 min. Then sections
were stained in X-Gal solution (0.6 mg/ml X-Gal in DMSO, 5 mM potassium
hexacyanoferrate(III), 5 mM potassium hexacyanoferrate(II) in LacZ wash buffer)
overnight at 37 °C. After staining, sections were washed in 1x PBS twice for
5 min. Dehydration was performed by washing the sections first with 70 %, 96 %
and 100 % ethanol for 2 min each, then with a 1:1 mix of ethanol and xylol for
5 min and finally with xylol alone for 5 min. Before taking pictures, sections
were mounted in Entellan.

E. Results: To identify transgenic founder mice carrying the
pCAGGS-C31CNLS-pA transgene, 29 mice born from the injection experiment were
analyzed for the presence of the transgene. 5 founder mice (3 females and 2
males) were identified. To analyze the activity of the C31-Int(CNLS)
recombinase in transgenic mice, 2 of the female founder mice were crossed to
heterozygous C31 reporter mice carrying a C31 reporter construct in the ROSA26
locus (Fig. 8). From each of these crosses, one offspring carrying the
pCAGGS-C31CNLS-pA transgene as well as the C31 reporter allele was sacrificed.
In order to determine whether pCAGGS-C31CNLS-pA transgenic mice are able to
delete an attB/P flanked DNA sequence in the mouse germline, tissue sections
from the testes of the sacrificed animals were prepared and stained for
β-galactosidase activity with X-Gal. Fig. 9 shows the result of the staining
experiment for one of these mice (A) as well as a control mouse carrying only
the reporter allele, but lacking the pCAGGS-C31CNLS-pA transgene (B). Clear
staining can be detected in the maturing sperm cells in about 50 % of the
tubules with the proportion of β-galactosidase expressing cells ranging between
10 and 100 %. No staining could be detected for the control mouse. This clearly
demonstrates that C31-Int-mediated recombination has taken place during
spermatogenesis in the pCAGGS-C31CNLS-pA transgenic mice. These results show
that the C31-Int is functional in vivo, in a transgenic mouse system, and
therefore provides a new tool to introduce specific deletions, inversions or
integrations into the mouse germline.

1. A fusion protein comprising

(a) a recombinase domain comprising a recombinase protein or a mutant
thereof having a recombinase activity similar to that of the corresponding
wild-type recombinase and

(b) a signal peptide domain linked to said recombinase domain which directs 
nuclear import of said fusion protein in eucaryotic cells. 

2. The fusion protein of claim 1, wherein the activity of the fusion protein in 
eucaryotic cells is significantly higher as compared to that of the wild-type 
recombinase corresponding to the recombinase of the recombinase domain. 

3. The fusion protein of claim 1 or 2, wherein the recombinase domain comprises 
a recombinase protein belonging to the family of large serine recombinases or a 
mutant thereof, preferably the recombinase domain comprises a recombinase 
protein selected from the group consisting of bacteriophage ΦC31 integrase
(C31-Int), coliphage P4 recombinase, Listeria phage recombinase, bacteriophage 
R4 Sre recombinase, CisA recombinase, XisF recombinase, transposon Tn4451 
TnpX recombinase and lactococcal bacteriophage TP901-1 recombinase, or a 
mutant thereof; most preferably the recombinase protein is a C31-Int protein or 
a mutant thereof. 

4. The fusion protein of claim 3, wherein the recombinase protein comprises a 
C31-Int having the amino acid sequence shown in SEQ ID NO:21 or a C-terminal 
truncated form thereof, said truncated form of the C31-Int preferably comprising 
amino acid residues of 306 to 613 of SEQ ID NO:21. 

5. The fusion protein according to any one of claims 1 to 4, wherein the signal 
peptide domain is derived from yeast GAL4, SKI3, L29 or histone H2B proteins, 
polyoma virus large T protein, VP1 or VP2 capsid protein, SV40 VP1 or VP2 capsid 
protein, adenovirus E1a or DBP protein, influenza virus NS1 protein, hepatitis
virus core antigen or the mammalian lamin, c-myc, max, c-myb, p53, c-erbA,
jun, Tax, steroid receptor or Mx proteins, SV40 T-antigen or other proteins with
known nuclear localisation, preferably the signal peptide domain comprises a
peptide which is derived from the SV40 T-antigen.

6. The fusion protein according to any one of claims 1 to 5, wherein the signal
peptide domain

(i) has a length of 5 to 74, preferably 7 to 15 amino acid residues, and/or 

(ii) comprises a segment of 6 amino acid residues having at least 2 positively 
charged basic amino acid residues, said basic amino acid residues being 
preferably selected from lysine, arginine and histidine. 


7. The fusion protein of claim 5 or 6, wherein the signal peptide domain 
comprises a peptide selected from a sequence shown in SEQ ID NOs:24 to 53, 
preferably the signal peptide comprises the amino acid sequence Pro-Lys-Lys- 
Lys-Arg-Lys-Val (SEQ ID NO:53). 
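
As an informal illustration only, the two structural criteria recited in claim 6 (overall length, and a 6-residue segment containing at least two basic residues) can be checked mechanically for a candidate peptide. The sketch below is an assumption-laden reading of the claim wording, not a definition of its scope: the function name `check_signal_peptide` and the returned keys are hypothetical, and the peptide is given in one-letter code (Pro-Lys-Lys-Lys-Arg-Lys-Val written as PKKKRKV).

```python
BASIC_RESIDUES = set("KRH")  # lysine, arginine, histidine (one-letter code)


def check_signal_peptide(peptide):
    """Evaluate a peptide against the length and basic-segment criteria.

    Returns a dict of booleans; the key names are illustrative only.
    """
    seq = peptide.upper()
    # All contiguous 6-residue windows; empty if the peptide is shorter than 6
    windows = [seq[i:i + 6] for i in range(max(len(seq) - 5, 0))]
    has_basic_segment = any(
        sum(residue in BASIC_RESIDUES for residue in window) >= 2
        for window in windows
    )
    return {
        "length_5_to_74": 5 <= len(seq) <= 74,
        "preferred_length_7_to_15": 7 <= len(seq) <= 15,
        "basic_segment": has_basic_segment,
    }


# SV40 T-antigen derived peptide named in claim 7 (SEQ ID NO:53)
sv40_nls = check_signal_peptide("PKKKRKV")
print(sv40_nls)
```

For the SV40 T-antigen peptide all three checks come out true under this reading, since the first six residues alone contain five basic residues.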


8. The fusion protein according to any one of claims 1 to 6, wherein 

(i) the signal peptide domain is linked to the N-terminal or C-terminal of the 
recombinase domain or is integrated into the recombinase domain, preferably the 
signal peptide domain is linked to the C-terminal of the recombinase domain; 

and/or

(ii) the signal peptide domain is linked to the recombinase domain directly or 
through a linker peptide, said linker preferably having 1 to 30 essentially neutral 
amino acid residues. 

9. The fusion protein of claim 1 comprising the amino acid sequence shown in
SEQ ID NO:23. 

10. A DNA coding for the fusion protein according to any one of claims 1 to 9.

11. A vector containing the DNA as defined in claim 10.

12. A microorganism containing the DNA of claim 10 and/or the vector of claim 
11. 

13. A process for preparing the fusion protein as defined in any one of claims 1 to
9 which comprises culturing a microorganism as defined in claim 11 under 
conditions suitable for expression of said fusion protein and recovering said 
fusion protein. 

14. Use of the fusion protein as defined in any one of claims 1 to 9 to recombine 
DNA molecules, which contain recombinase recognition sequences for the 
recombinase protein of the recombinase domain, in eucaryotic cells. 

15. A cell, preferably a mammalian cell containing the DNA sequence of claim 10 
in its genome. 

16. The cell of claim 15, also containing recognition sequences for the 
recombinase protein of the recombinase domain in its genome. 

17. Use of the cell of claim 15 or 16 for studying the function of genes and for 
the creation of transgenic organisms. 

18. A transgenic organism, preferably a transgenic non-human mammal 
containing the DNA sequence of claim 10 in its genome. 

19. The transgenic organism of claim 18 also containing recognition sequences 
for the recombinase protein of the recombinase domain in its genome. 

20. Use of the transgenic organism of claim 18 or 19 for studying gene function 
at various developmental stages. 

21. A method for recombining DNA molecules of cells or organisms containing 
recombinase recognition sequences for the recombinase protein of the 
recombinase domain as defined in claims 1 to 9, which method comprises 
supplying the cells or organisms with a fusion protein as defined in claims 1 to 9 
or with a DNA sequence of claim 10 and/or a vector of claim 11 which are 
capable of expressing said fusion protein in the cell or organism. 

22. A method for recombining a DNA molecule containing recognition sequences
for a recombinase protein in a eucaryotic cell, said method comprising contacting
the cell with a fusion protein according to claim 1 that recognizes said recognition
sequences, wherein the fusion protein catalyzes recombination of the DNA
molecule.

23. The fusion protein according to any one of claims 1 to 9 which catalyzes 
recombination at recognition sequences for the recombinase protein. 

24. A transgenic organism, preferably a transgenic non-human mammal,
comprising a cell containing a DNA sequence coding for a recombinase fusion 
protein as defined in claims 1 to 9 and 23 in its genome. 

SEQUENCE LISTING

<110> Artemis Pharmaceuticals GmbH 

<120> Modified Recombinase 

<130> 012787wo/JH/ml 

<140>
<141>

<160> 108

<170> PatentIn Ver. 2.1






<210> 1 
<211> 86 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer C31-4 
<400> 1

ttllllttlt Iftltl 9 ^ ^990t ccccgggcgc 60 

8 6 

<210> 2
<211> 86
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer C31-5 
<400> 2 

40 tlllllttll llllllllll SSS" 090 gCCC ™^ «c-9W« cgccctggcc 60 

8 6. 

<210> 3 
<211> 90 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer C31-6 
<400> 3 

gatccagaag cggttttcgg gagtagtgcc ccaactgggg taacctttga gttctctcaa 60 
ttgggggcgt agggtcgccg acatgacacg * 9 g^tctctcag 60 

<210> 4 
<211> 90 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer C31-7-2 
<400> 4 

SJSSK JSS -*«*••»■ « 

<210> 5
<211> 7438
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: vector pRK64
<400> 5

™ atCa ^° ^aacgcgcg aggcagctgt ggaatgtgtg tcagttaggg tqtaaaaaot 60 
ccccaggctc cccagcaggc agaagtatgc aaagcatgci tctSaattS tcSaacca So 
S^T 9 ca ^ ca 9 aa 9 tgtgcaaagc atgcatctca attagtcagc aaccataatc llo 
15 ca^ggSac SSSK XSK5 artCC «™ * tc ^ c « t^ccglcc 2 iS 

isqsiiisrsiil 

20 tt g ata g , 9 ,,t g 9CCgCcac 9 a ccggccggcc ggtgccgcca ccatcccctg Lccacgccc 540 
So acaaggagac gaccttccat gaccgagtac aagcccacgg tgcgccfcgc 600 
cacccgcgac gacgtccccc gggccgtacg caccctcgcc gccgcgttcg ccqactaccc 660 
cgccacgcgc cacaccgtcg acccggaccg ccacatcgag cgggtcaccg agctqcaaSa 120 
actcttcctc acgcgcgtcg ggctcgacat cggcaaggtg tgggtcgcgg acgacqqcac 180 
cgcggtggcg gtctggacca cgccggagag cgtcgaagc? ggggcggSt tcqccqfaft III 

25 cggcccgcgc atggccgagt tgagcggttc ccggctggcc gcgcagcalc agSglalS 900 
a™" 05 CCgcacc 99 c ccaaggagcc cgcgtggttc Itggccaccg tcgqcgtc?c III 
gcccgaccac cagggcaagg gtctgggcag cgccgtcgtg ctccccggag tgqaqqcqqc lOM 
cgagcgcgcc ggggtgcccg ccttcctgga gacctccgcg ccccgcalcc tcccc?tc?a S 

30 SSataaS ggCttCaCcg tcaccgccga cgtcgagtgc ccgaaggacc gcgcgacc£g Xlll 
»™ 9 cgcaagcccg gtgcctgacg cccgccccac gacccgcagc gcccqaccqa 1200 

i ™ ~ sss = ™ 

35 =S5S2 

S SJEES £SS ££££ 

aactgagaga actcaaaggt taccccagtt ggggcactac Lccgalaac cqc??ctqaa JIIS 
tccataactt cgtatagcat acattatacg aag^tatacc gggcLcca? ggtcqcqaqt S 
40 c?S ggCaC * ggccgtcgt tttacaacgt cgtgactggg aILccc?gg cgScccfa 1?JS 
a C cga Sec c Sea gSSaac ^aatagc^ a^aggcccgc Je™ 

ccggcac ag SSSS £aa g ag 9 c a ££Kg a ? c ™ ~S S3 
acc?atc^a ~ tca " ct * acag-tgeac ggttacgatg cgcccatc?a caccaacgta 1N0 
5 tcgctcacat "aata^oa £2°°"?* g " CCCacgg agaatccgac gggttgtSc 2o1S 
ga?ggcgtta actcqqeq?? SateS" f aCaggaa 9 gccagacgcg aattattttt 21 00 

£¥r Prffi SSS5 s ™" sssss sssa is 

ctcgcggtga tggtgctgcg ttggagtgac ggcagttatc tggaagatca qqatatotaa 2?80 
) ^ atttt "9 tgaegtcteg ttgetgeata aaccgactac X£££X 2340 

ttgccact ^ ctttaatgat gatttcagee gcgctgtact ggaqjctgla 24o2 
gttcagatgt gcggcgagtt gcgtgactac etaegggtaa cagtttottt a?qqcaaaat 24 60 
gaaaegcagg tcgccagcgg caccgcgcct ttcggcggtg aaattatcga JglgcgtgS llto 
aK^n 9CC9 •*°Sf°Jf t « c fctacgtctg aacgtcgaaa acccgaaact gtgglgclS 25sS 
gaaatcccga atctctatcg tgcggtggtt gaactgeaca ccgccgacgg cacqctaatt 2640 
» gaagcagaag ectgegatgt cggtttccgc gaggtgcgga ttgaaaatgg tctqctqrtq 2700 
SrtSS?" 3gC T t9Ct gatt ^ a ^ c gttlaccgtc aogagcatca cSg 27sS 
S» a n 9 tggatgagca gacgatggtg caggatatcc tgctgatgaa gcagaacaac 2820 
tttaacgccg tgcgctgttc gcattatccg aaccatccgc tgtggtacac gctgtgcqac 2880 
cgctacggcc tgtatgtggt ggatgaagee aatattgaaa cccacggcat ggtqccaatS till 
» aatcgtctga ccgatgatcc gcgctggcta ccggcgatga gcgaacgcgt facgcqaSq 3000 
gtgcagcgcg ategtaatea cccgagtgtg atcatctggt cgctggggaa tgaatcagqc ll 60 
alJ-tT^ atcac * ac ? c 3ctgtatcgc tggatcaalt c?gtcga?cc t?cccgcccg !«! 
aca^f I 9 a f ggCggcg 9 agccgacacc acggccaccg atattatttg cccga?g?ac 3180 
gcgcgcgtgg atgaagacca gcccttcccg gctgtgccga aatggtcca? caaaaaatgg 3240 
ctggagagac gcgcccgctg atectttgeg aatacgccca cgcgatgqg? 3300 
aacagtcttg gcggtttcgc taaatactgg caggegttte gtcagtatcc ccqtttaclo 3^60 
ggeggctteg tctgggactg ggtggatcag tegtgatta aata^gatg^ aSggcaac 3420 
ccgtggtcgg cttacggcgg tgattttggc gatacgccga acgatcgcca gttctgtatg 3480
aacggtctgg tctttgccga ccgcacgccg catccagcgc tgacggaagc aaaacaccag 3540
cagcagtttt tccagttccg tttatccggg caaaccatcg aagtgaccag cgaatacctg 3600
ttccgtcata gcgataacga gctcctgcac tggatggtgg cgctggatgg taagccgctg 3660
gcaagcggtg aagtgcctct ggatgtcgct ccacaaggta aacagttgat tgaactgcct 3720
gaactaccgc agccggagag cgccgggcaa ctctggctca cagtacgcgt agtgcaaccg 3780
aacgcgaccg catggtcaga agccgggcac atcagcgcct ggcagcagtg gcgtctggcg 3840
gaaaacctca gtgtgacgct ccccgccgcg tcccacgcca tcccgcatct gaccaccagc 3900
gaaatggatt tttgcatcga gctgggtaat aagcgttggc aatttaaccg ccagtcaggc 3960
tttctttcac agatgtggat tggcgataaa aaacaactgc tgacgccgct gcgcgatcag 4020
ttcacccgtg caccgctgga taacgacatt ggcgtaagtg aagcgacccg cattgaccct 4080
aacgcctggg tcgaacgctg gaaggcggcg ggccattacc aggccgaagc agcgttgttg 4140
cagtgcacgg cagatacact tgctgatgcg gtgctgatta cgaccgctca cgcgtggcag 4200
catcagggga aaaccttatt tatcagccgg aaaacctacc ggattgatgg tagtggtcaa 4260
atggcgatta ccgttgatgt tgaagtggcg agcgatacac cgcatccggc gcggattggc 4320
ctgaactgcc agctggcgca ggtagcagag cgggtaaact ggctcggatt agggccgcaa 4380
gaaaactatc ccgaccgcct tactgccgcc tgttttgacc gctgggatct gccattgtca 4440
gacatgtata ccccgtacgt cttcccgagc gaaaacggtc tgcgctgcgg gacgcgcgaa 4500
ttgaattatg gcccacacca gtggcgcggc gacttccagt tcaacatcag ccgctacagt 4560
caacagcaac tgatggaaac cagccatcgc catctgctgc acgcggaaga aggcacatgg 4620
ctgaatatcg acggtttcca tatggggatt ggtggcgacg actcctggag cccgtcagta 4680
tcggcggaat tccagctgag cgccggtcgc taccattacc agttggtctg gtgtcaaaaa 4740
taataataac cgggcagggg ggatctttgt gaaggaacct tacttctgtg gtgtgacata 4800
attggacaaa ctacctacag agatttaaag ctctaaggta aatataaaat ttttaagtgt 4860
ataatgtgtt aaactactga ttctaattgt ttgtgtattt tagattccaa cctatggaac 4920
tgatgaatgg gagcagtggt ggaatgccag atccagacat gataagatac attgatgagt 4980
ttggacaaac cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg 5040
ctattgcttt atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca 5100
ttcattttat gtttcaggtt cagggggagg tgtgggaggt tttttaaagc aagtaaaacc 5160
tctacaaatg tggtatggct gattatgatc tgcggccgca gggcctcgtg atacgcctat 5220
ttttataggt taatgtcatg ataataatgg tttcttagac gtcaggtggc acttttcggg 5280
gaaatgtgcg cggaacccct atttgtttat ttttctaaat acattcaaat atgtatccgc 5340
tcatgagaca ataaccctga taaatgcttc aataatattg aaaaaggaag agtatgagta 5400
ttcaacattt ccgtgtcgcc cttattccct tttttgcggc attttgcctt cctgtttttg 5460
ctcacccaga aacgctggtg aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg 5520
gttacatcga actggatctc aacagcggta agatccttga gagttttcgc cccgaagaac 5580
gttttccaat gatgagcact tttaaagttc tgctatgtgg cgcggtatta tcccgtattg 5640
acgccgggca agagcaactc ggtcgccgca tacactattc tcagaatgac ttggttgagt 5700
actcaccagt cacagaaaag catcttacgg atggcatgac agtaagagaa ttatgcagtg 5760
ctgccataac catgagtgat aacactgcgg ccaacttact tctgacaacg atcggaggac 5820
cgaaggagct aaccgctttt ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt 5880
gggaaccgga gctgaatgaa gccataccaa acgacgagcg tgacaccacg atgcctgtag 5940
caatggcaac aacgttgcgc aaactattaa ctggcgaact acttactcta gcttcccggc 6000
aacaattaat agactggatg gaggcggata aagttgcagg accacttctg cgctcggccc 6060
ttccggctgg ctggtttatt gctgataaat ctggagccgg tgagcgtggg tctcgcggta 6120
tcattgcagc actggggcca gatggtaagc cctcccgtat cgtagttatc tacacgacgg 6180
ggagtcaggc aactatggat gaacgaaata gacagatcgc tgagataggt gcctcactga 6240
ttaagcattg gtaactgtca gaccaagttt actcatatat actttagatt gatttaaaac 6300
ttcattttta atttaaaagg atctaggtga agatcctttt tgataatctc atgaccaaaa 6360
tcccttaacg tgagttttcg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat 6420
cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc 6480
taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg 6540
gcttcagcag agcgcagata ccaaatactg tccttctagt gtagccgtag ttaggccacc 6600
acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg 6660
ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg 6720
ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa 6780
cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg 6840
aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga 6900
gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct 6960
gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca 7020
gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 7080
ctgcgttatc ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg 7140
ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc 7200
caatacgcaa accgcctctc cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca 7260
ggtttcccga ctggaaagcg ggcagtgagc gcaacgcaat taatgtgagt tagctcactc 7320
attaggcacc ccaggcttta cactttatgc ttccggctcg tatgttgtgt ggaattgtga 7380
gcggataaca atttcacaca ggaaacagct atgaccatga ttacgccaag ctggcgcg 7438

<210> 6
<211> 45
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer C31-1
<400> 6

ataagaatgc ggccgcccga tatgacacaa ggggttgtga ccggg 45

<210> 7
<211> 40
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer C31-3
<400> 7

ataagaatgc ggccgcatcc gccgctacgt cttccgtgcc 40

<210> 8
<211> 24
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer C31-8
<400> 8

cccgttggca ggaagcactt ccgg 24

<210> 9
<211> 55
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer C31-9
<400> 9

ggatcctcga gccgcgggcg gccgcctacg ccgctacgtc ttccgtgccg tcctg 55

<210> 10 
<211> 5711 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pCMV-C31-Int (wt) 

<400> 10

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960
cagggcggcc gcccgatatg acacaagggg ttgtgaccgg ggtggacacg tacgcgggtg 1020
cttacgaccg tcagtcgcgc gagcgcgaga attcgagcgc agcaagccca gcgacacagc 1080
gtagcgccaa cgaagacaag gcggccgacc ttcagcgcga agtcgagcgc gacgggggcc 1140
ggttcaggtt cgtcgggcat ttcagcgaag cgccgggcac gtcggcgttc gggacggcgg 1200
agcgcccgga gttcgaacgc atcctgaacg aatgccgcgc cgggcggctc aacatgatca 1260
ttgtctatga cgtgtcgcgc ttctcgcgcc tgaaggtcat ggacgcgatt ccgattgtct 1320
cggaattgct cgccctgggc gtgacgattg tttccactca ggaaggcgtc ttccggcagg 1380
gaaacgtcat ggacctgatt cacctgatta tgcggctcga cgcgtcgcac aaagaatctt 1440
cgctgaagtc ggcgaagatt ctcgacacga agaaccttca gcgcgaattg ggcgggtacg 1500
tcggcgggaa ggcgccttac ggcttcgagc ttgtttcgga gacgaaggag atcacgcgca 1560
acggccgaat ggtcaatgtc gtcatcaaca agcttgcgca ctcgaccact ccccttaccg 1620
gacccttcga gttcgagccc gacgtaatcc ggtggtggtg gcgtgagatc aagacgcaca 1680
aacaccttcc cttcaagccg ggcagtcaag ccgccattca cccgggcagc atcacggggc 1740
tttgtaagcg catggacgct gacgccgtgc cgacccgggg cgagacgatt gggaagaaga 1800
ccgcttcaag cgcctgggac ccggcaaccg ttatgcgaat ccttcgggac ccgcgtattg 1860
cgggcttcgc cgctgaggtg atctacaaga agaagccgga cggcacgccg accacgaaga 1920
ttgagggtta ccgcattcag cgcgacccga tcacgctccg gccggtcgag cttgattgcg 1980
gaccgatcat cgagcccgct gagtggtatg agcttcaggc gtggttggac ggcagggggc 2040
gcggcaaggg gctttcccgg gggcaagcca ttctgtccgc catggacaag ctgtactgcg 2100
agtgtggcgc cgtcatgact tcgaagcgcg gggaagaatc gatcaaggac tcttaccgct 2160
gccgtcgccg gaaggtggtc gacccgtccg cacctgggca gcacgaaggc acgtgcaacg 2220
tcagcatggc ggcactcgac aagttcgttg cggaacgcat cttcaacaag atcaggcacg 2280
ccgaaggcga cgaagagacg ttggcgcttc tgtgggaagc cgcccgacgc ttcggcaagc 2340
tcactgaggc gcctgagaag agcggcgaac gggcgaacct tgttgcggag cgcgccgacg 2400
ccctgaacgc ccttgaagag ctgtacgaag accgcgcggc aggcgcgtac gacggacccg 2460
ttggcaggaa gcacttccgg aagcaacagg cagcgctgac gctccggcag caaggggcgg 2520
aagagcggct tgccgaactt gaagccgccg aagccccgaa gcttcccctt gaccaatggt 2580
tccccgaaga cgccgacgct gacccgaccg gccctaagtc gtggtggggg cgcgcgtcag 2640
tagacgacaa gcgcgtgttc gtcgggctct tcgtagacaa gatcgttgtc acgaagtcga 2700
ctacgggcag ggggcaggga acgcccatcg agaagcgcgc ttcgatcacg tgggcgaagc 2760
cgccgaccga cgacgacgaa gacgacgccc aggacggcac ggaagacgta gcggcgtagg 2820
cggcgcccgg gctcgagatc caggcgcgga tcaataaaag atcattattt tcaatagatc 2880
tgtgtgttgg ttttttgtgt gccttggggg agggggaggc cagaatgagg cgcggccaag 2940
ggggaggggg aggccagaat gaccttgggg gagggggagg ccagaatgac cttgggggag 3000
ggggaggcca gaatgaggcg cgcccccggg taccgagctc gaattcactg gccgtcgttt 3060
tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 3120
cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 3180
tgcgcagcct gaatggcgaa tggcgcctga tgcggtattt tctccttacg catctgtgcg 3240
gtatttcaca ccgcatatgg tgcactctca gtacaatctg ctctgatgcc gcatagttaa 3300
gccagccccg acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 3360
catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 3420
cgtcatcacc gaaacgcgcg agacgaaagg gcctcgtgat acgcctattt ttataggtta 3480
atgtcatgat aataatggtt tcttagacgt caggtggcac ttttcgggga aatgtgcgcg 3540
gaacccctat ttgtttattt ttctaaatac attcaaatat gtatccgctc atgagacaat 3600
aaccctgata aatgcttcaa taatattgaa aaaggaagag tatgagtatt caacatttcc 3660
gtgtcgccct tattcccttt tttgcggcat tttgccttcc tgtttttgct cacccagaaa 3720
cgctggtgaa agtaaaagat gctgaagatc agttgggtgc acgagtgggt tacatcgaac 3780
tggatctcaa cagcggtaag atccttgaga gttttcgccc cgaagaacgt tttccaatga 3840
tgagcacttt taaagttctg ctatgtggcg cggtattatc ccgtattgac gccgggcaag 3900
agcaactcgg tcgccgcata cactattctc agaatgactt ggttgagtac tcaccagtca 3960
cagaaaagca tcttacggat ggcatgacag taagagaatt atgcagtgct gccataacca 4020
tgagtgataa cactgcggcc aacttacttc tgacaacgat cggaggaccg aaggagctaa 4080
ccgctttttt gcacaacatg ggggatcatg taactcgcct tgatcgttgg gaaccggagc 4140
tgaatgaagc cataccaaac gacgagcgtg acaccacgat gcctgtagca atggcaacaa 4200
cgttgcgcaa actattaact ggcgaactac ttactctagc ttcccggcaa caattaatag 4260
actggatgga ggcggataaa gttgcaggac cacttctgcg ctcggccctt ccggctggct 4320
ggtttattgc tgataaatct ggagccggtg agcgtgggtc tcgcggtatc attgcagcac 4380
tggggccaga tggtaagccc tcccgtatcg tagttatcta cacgacgggg agtcaggcaa 4440
ctatggatga acgaaataga cagatcgctg agataggtgc ctcactgatt aagcattggt 4500
aactgtcaga ccaagtttac tcatatatac tttagattga tttaaaactt catttttaat 4560
ttaaaaggat ctaggtgaag atcctttttg ataatctcat gaccaaaatc ccttaacgtg 4620
agttttcgtt ccactgagcg tcagaccccg tagaaaagat caaaggatct tcttgagatc 4680
ctttttttct gcgcgtaatc tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg 4740
tttgtttgcc ggatcaagag ctaccaactc tttttccgaa ggtaactggc ttcagcagag 4800
cgcagatacc aaatactgtc cttctagtgt agccgtagtt aggccaccac ttcaagaact 4860
ctgtagcacc gcctacatac ctcgctctgc taatcctgtt accagtggct gctgccagtg 4920
gcgataagtc gtgtcttacc gggttggact caagacgata gttaccggat aaggcgcagc 4980
ggtcgggctg aacggggggt tcgtgcacac agcccagctt ggagcgaacg acctacaccg 5040
aactgagata cctacagcgt gagctatgag aaagcgccac gcttcccgaa gggagaaagg 5100
cggacaggta tccggtaagc ggcagggtcg gaacaggaga gcgcacgagg gagcttccag 5160
ggggaaacgc ctggtatctt tatagtcctg tcgggtttcg ccacctctga cttgagcgtc 5220
gatttttgtg atgctcgtca ggggggcgga gcctatggaa aaacgccagc aacgcggcct 5280
ttttacggtt cctggccttt tgctggcctt ttgctcacat gttctttcct gcgttatccc 5340
ctgattctgt ggataaccgt attaccgcct ttgagtgagc tgataccgct cgccgcagcc 5400
gaacgaccga gcgcagcgag tcagtgagcg aggaagcgga agagcgccca atacgcaaac 5460
cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg gcacgacagg tttcccgact 5520
ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta gctcactcat taggcacccc 5580
aggctttaca ctttatgctt ccggctcgta tgttgtgtgg aattgtgagc ggataacaat 5640
ttcacacagg aaacagctat gaccatgatt acgccaagct agcccgggct agcttgcatg 5700
cctgcaggtt t 5711






<210> 11 
<211> 69 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer C31-2-2
<400> 11

tagaattccg ctcgagagtc taaaccttcc tcttcttctt aggcgccgct acgtcttccg 60
tgccgtcct 69



<210> 12 
<211> 5723 
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: vector
pCMV-C31-Int (CNLS)

<400> 12

cctgcaggtt taaacagtcc gatgtacggg ccagatatac gcgttgacat tgattattga 60
ctagttatta atagtaatca attacggggt cattagttca tagcccatat atggagttcc 120
gcgttacata acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat 180
tgacgtcaat aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc 240
aatgggtgga ctatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc 300
caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt 360
acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta 420
ccatggtgat gcggttttgg cagtacatca atgggcgtgg atagcggttt gactcacggg 480
gatttccaag tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac 540
gggactttcc aaaatgtcgt aacaactccg ccccattgac gcaaatgggc ggtaggcgtg 600
tacggtggga ggtctatata agcagagctc tctggctaac tagagaaccc actgcttact 660
ggcttatcga aattaatacg actcactata gggagaccca agctgactct agacttaatt 720
aagcgttggg gtgagtactc cctctcaaaa gcgggcatga cttctgcgct aagattgtca 780
gtttccaaaa acgaggagga tttgatattc acctggcccg cggtgatgcc tttgagggtg 840
gccgcgtcca tctggtcaga aaagacaatc tttttgttgt caagcttgag gtgtggcagg 900
cttgagatct ggccatacac ttgagtgaca ttgacatcca ctttgccttt ctctccacag 960
gtgtccactc ccagggcggc cgcccgatat gacacaaggg gttgtgaccg gggtggacac 1020
gtacgcgggt gcttacgacc gtcagtcgcg cgagcgcgag aattcgagcg cagcaagccc 1080
agcgacacag cgtagcgcca acgaagacaa ggcggccgac cttcagcgcg aagtcgagcg 1140
cgacgggggc cggttcaggt tcgtcgggca tttcagcgaa gcgccgggca cgtcggcgtt 1200
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10 



15 



20 



25 



30 



35 



40 



45 



50 



55 



60 



65 



cgggacggcg gagcgcccgg agttcgaacg catcctgaac gaatgccgcg ccgggcggct 1260
caacatgatc attgtctatg acgtgtcgcg cttctcgcgc ctgaaggtca tggacgcgat 1320
tccgattgtc tcggaattgc tcgccctggg cgtgacgatt gtttccactc aggaaggcgt 1380
cttccggcag ggaaacgtca tggacctgat tcacctgatt atgcggctcg acgcgtcgca 1440
caaagaatct tcgctgaagt cggcgaagat tctcgacacg aagaaccttc agcgcgaatt 1500
gggcgggtac gtcggcggga aggcgcctta cggcttcgag cttgtttcgg agacgaagga 1560
gatcacgcgc aacggccgaa tggtcaatgt cgtcatcaac aagcttgcgc actcgaccac 1620
tccccttacc ggacccttcg agttcgagcc cgacgtaatc cggtggtggt ggcgtgagat 1680
caagacgcac aaacaccttc ccttcaagcc gggcagtcaa gccgccattc acccgggcag 1740
catcacgggg ctttgtaagc gcatggacgc tgacgccgtg ccgacccggg gcgagacgat 1800
tgggaagaag accgcttcaa gcgcctggga cccggcaacc gttatgcgaa tccttcggga 1860
cccgcgtatt gcgggcttcg ccgctgaggt gatctacaag aagaagccgg acggcacgcc 1920
gaccacgaag attgagggtt accgcattca gcgcgacccg atcacgctcc ggccggtcga 1980
gcttgattgc ggaccgatca tcgagcccgc tgagtggtat gagcttcagg cgtggttgga 2040
cggcaggggg cgcggcaagg ggctttcccg ggggcaagcc attctgtccg ccatggacaa 2100
gctgtactgc gagtgtggcg ccgtcatgac ttcgaagcgc ggggaagaat cgatcaagga 2160
ctcttaccgc tgccgtcgcc ggaaggtggt cgacccgtcc gcacctgggc agcacgaagg 2220
cacgtgcaac gtcagcatgg cggcactcga caagttcgtt gcggaacgca tcttcaacaa 2280
gatcaggcac gccgaaggcg acgaagagac gttggcgctt ctgtgggaag ccgcccgacg 2340
cttcggcaag ctcactgagg cgcctgagaa gagcggcgaa cgggcgaacc ttgttgcgga 2400
gcgcgccgac gccctgaacg cccttgaaga gctgtacgaa gaccgcgcgg caggcgcgta 2460
cgacggaccc gttggcagga agcacttccg gaagcaacag gcagcgctga cgctccggca 2520
gcaaggggcg gaagagcggc ttgccgaact tgaagccgcc gaagccccga agcttcccct 2580
tgaccaatgg ttccccgaag acgccgacgc tgacccgacc ggccctaagt cgtggtgggg 2640
gcgcgcgtca gtagacgaca agcgcgtgtt cgtcgggctc ttcgtagaca agatcgttgt 2700
cacgaagtcg actacgggca gggggcaggg aacgcccatc gagaagcgcg cttcgatcac 2760
gtgggcgaag ccgccgaccg acgacgacga agacgacgcc caggacggca cggaagacgt 2820
agcggcgcct aagaagaaga ggaaggttta gactctcgag atccaggcgc ggatcaataa 2880
aagatcatta ttttcaatag atctgtgtgt tggttttttg tgtgccttgg gggaggggga 2940
ggccagaatg aggcgcggcc aagggggagg gggaggccag aatgaccttg ggggaggggg 3000
aggccagaat gaccttgggg gagggggagg ccagaatgag gcgcgccccc gggtaccgag 3060
ctcgaattca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca 3120
acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg 3180
caccgatcgc ccttcccaac agttgcgcag cctgaatggc gaatggcgcc tgatgcggta 3240
ttttctcctt acgcatctgt gcggtatttc acaccgcata tggtgcactc tcagtacaat 3300
ctgctctgat gccgcatagt taagccagcc ccgacacccg ccaacacccg ctgacgcgcc 3360
ctgacgggct tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg tctccgggag 3420
ctgcatgtgt cagaggtttt caccgtcatc accgaaacgc gcgagacgaa agggcctcgt 3480
gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga cgtcaggtgg 3540
cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa 3600
tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa 3660
gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct 3720
tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg 3780
tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg 3840
ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt 3900
atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga 3960
cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga 4020
attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 4080
gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg 4140
ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac 4200
gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct 4260
agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct 4320
gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg 4380
gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat 4440
ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg 4500
tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat 4560
tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct 4620
catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 4680
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa 4740
aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc 4800
gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta 4860
gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct 4920
gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 4980
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag 5040
cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc 5100
cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg 5160
agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt 5220
tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 5280
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca 5340
catgttcttt cctgcgttat cccctgattc tgtggataac cgtattaccg cctttgagtg 5400
agctgatacc gctcgccgca gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc 5460
ggaagagcgc ccaatacgca aaccgcctct ccccgcgcgt tggccgattc attaatgcag 5520
ctggcacgac aggtttcccg actggaaagc gggcagtgag cgcaacgcaa ttaatgtgag 5580
ttagctcact cattaggcac cccaggcttt acactttatg cttccggctc gtatgttgtg 5640
tggaattgtg agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa 5700
gctagcccgg gctagcttgc atg 5723



<210> 13 
<211> 4960 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pCMV-Cre 

<400> 13 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 
cagggcggcc tcgaccatgc ccaagaagaa gaggaaggtg tccaatttac tgaccgtaca 1020 
ccaaaatttg cctgcattac cggtcgatgc aacgagtgat gaggttcgca agaacctgat 1080 
ggacatgttc agggatcgcc aggcgttttc tgagcatacc tggaaaatgc ttctgtccgt 1140 
ttgccggtcg tgggcggcat ggtgcaagtt gaataaccgg aaatggtttc ccgcagaacc 1200 
tgaagatgtt cgcgattatc ttctatatct tcaggcgcgc ggtctggcag taaaaactat 1260 
ccagcaacat ttgggccagc taaacatgct tcatcgtcgg tccgggctgc cacgaccaag 1320 
tgacagcaat gctgtttcac tggttatgcg gcggatccga aaagaaaacg ttgatgccgg 1380 
tgaacgtgca aaacaggctc tagcgttcga acgcactgat ttcgaccagg ttcgttcact 1440 
catggaaaat agcgatcgct gccaggatat acgtaatctg gcatttctgg ggattgctta 1500 
taacaccctg ttacgtatag ccgaaattgc caggatcagg gttaaagata tctcacgtac 1560 
tgacggtggg agaatgttaa tccatattgg cagaacgaaa acgctggtta gcaccgcagg 1620 
tgtagagaag gcacttagcc tgggggtaac taaactggtc gagcgatgga tttccgtctc 1680 
tggtgtagct gatgatccga ataactacct gttttgccgg gtcagaaaaa atggtgttgc 1740 
cgcgccatct gccaccagcc agctatcaac tcgcgccctg gaagggattt ttgaagcaac 1800 
tcatcgattg atttacggcg ctaaggatga ctctggtcag agatacctgg cctggtctgg 1860 
acacagtgcc cgtgtcggag ccgcgcgaga tatggcccgc gctggagttt caataccgga 1920 
gatcatgcaa gctggtggct ggaccaatgt aaatattgtc atgaactata tccgtaacct 1980 
ggatagtgaa acaggggcaa tggtgcgcct gctggaagat ggcgattagc cattaacgcg 2040 
taaatgattg cagatccact agttctaggg ccgcgtcgac ctcgagatcc aggcgcggat 2100 
caataaaaga tcattatttt caatagatct gtgtgttggt tttttgtgtg ccttggggga 2160 
gggggaggcc agaatgaggc gcggccaagg gggaggggga ggccagaatg accttggggg 2220 
agggggaggc cagaatgacc ttgggggagg gggaggccag aatgaggcgc gcccccgggt 2280 
accgagctcg aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 2340 
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 2400 
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat 2460 
gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatatggt gcactctcag 2520 
tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga 2580 
cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg tgaccgtctc 2640 
cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg 2700 
cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc 2760
aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca 2820
ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa 2880
aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 2940
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca 3000
gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag 3060
ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc 3120
ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca 3180
gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 3240
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct 3300
gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt 3360
aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga 3420
caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact 3480
tactctagct tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 3540
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga 3600
gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt 3660
agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga 3720
gataggtgcc tcactgatta agcattggta actgtcagac caagtttact catatatact 3780
ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 3840
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 3900
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 3960
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 4020
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta 4080
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 4140
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 4200
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 4260
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 4320
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 4380
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 4440
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 4500
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 4560
tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt 4620
tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga 4680
ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 4740
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa 4800
tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 4860
gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta 4920
cgccaagcta gcccgggcta gcttgcatgc ctgcaggttt 4960



<210> 14
<211> 3858
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: vector pRK50

<400> 14

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960
cagggcggcc gcgtcgacct cgagatccag gcgcggatca ataaaagatc attattttca 1020
atagatctgt gtgttggttt tttgtgtgcc ttgggggagg gggaggccag aatgaggcgc 1080
ggccaagggg gagggggagg ccagaatgac cttgggggag ggggaggcca gaatgacctt 1140






gggggagggg gaggccagaa tgaggcgcgc ccccgggtac cgagctcgaa ttcactggcc 1200
gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca 1260
gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc 1320
caacagttgc gcagcctgaa tggcgaatgg cgcctgatgc ggtattttct ccttacgcat 1380
ctgtgcggta tttcacaccg catatggtgc actctcagta caatctgctc tgatgccgca 1440
tagttaagcc agccccgaca cccgccaaca cccgctgacg cgccctgacg ggcttgtctg 1500
ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat gtgtcagagg 1560
ttttcaccgt catcaccgaa acgcgcgaga cgaaagggcc tcgtgatacg cctattttta 1620
taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt tcggggaaat 1680
gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta tccgctcatg 1740
agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat gagtattcaa 1800
catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt ttttgctcac 1860
ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg agtgggttac 1920
atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga agaacgtttt 1980
ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg tattgacgcc 2040
gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt tgagtactca 2100
ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg cagtgctgcc 2160
ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg aggaccgaag 2220
gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga tcgttgggaa 2280
ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg 2340
gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc ccggcaacaa 2400
ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc ggcccttccg 2460
gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg cggtatcatt 2520
gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac gacggggagt 2580
caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc actgattaag 2640
cattggtaac tgtcagacca agtttactca tatatacttt agattgattt aaaacttcat 2700
ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac caaaatccct 2760
taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa aggatcttct 2820
tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca 2880
gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt aactggcttc 2940
agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg ccaccacttc 3000
aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc agtggctgct 3060
gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt accggataag 3120
gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga gcgaacgacc 3180
tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg 3240
agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg cacgagggag 3300
cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca cctctgactt 3360
gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac 3420
gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt ctttcctgcg 3480
ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga taccgctcgc 3540
cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga gcgcccaata 3600
cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca cgacaggttt 3660
cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct cactcattag 3720
gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat tgtgagcgga 3780
taacaatttc acacaggaaa cagctatgac catgattacg ccaagctagc ccgggctagc 3840
ttgcatgcct gcaggttt 3858






<210> 15 
<211> 6257 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
vector pRK64(deltaCre)

<400> 15

cgtcatcacc gaaacgcgcg aggcagctgt ggaatgtgtg tcagttaggg tgtggaaagt 60
ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca 120
ggctccccag caggcagaag tgtgcaaagc atgcatctca attagtcagc aaccatagtc 180
ccgcccctaa ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc 240
catggctgac taattttttt tatttatgca gaggccgagg ccgcctcggc ctaggaacag 300
tcgacgacac tgcagagacc tacttcacta acaaccggta cagttcgtgg accagatggg 360
tgaggtggag tacgcgcccg gggagcccaa gggcacgccc tggcacccgc accgcggctt 420
cgagaccgtc acgaatagat ccataacttc gtatagcata cattatacga agttataccg 480
ggccaccatg gtcgcgagta gcttggcact ggccgtcgtt ttacaacgtc gtgactggga 540






aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg ccagctggcg 600
taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga 660
atggcgcttt gcctggtttc cggcaccaga agcggtgccg gaaagctggc tggagtgcga 720
tcttcctgag gccgatactg tcgtcgtccc ctcaaactgg cagatgcacg gttacgatgc 780
gcccatctac accaacgtaa cctatcccat tacggtcaat ccgccgtttg ttcccacgga 840
gaatccgacg ggttgttact cgctcacatt taatgttgat gaaagctggc tacaggaagg 900
ccagacgcga attatttttg atggcgttaa ctcggcgttt catctgtggt gcaacgggcg 960
ctgggtcggt tacggccagg acagtcgttt gccgtctgaa tttgacctga gcgcattttt 1020
acgcgccgga gaaaaccgcc tcgcggtgat ggtgctgcgt tggagtgacg gcagttatct 1080
ggaagatcag gatatgtggc ggatgagcgg cattttccgt gacgtctcgt tgctgcataa 1140
accgactaca caaatcagcg atttccatgt tgccactcgc tttaatgatg atttcagccg 1200
cgctgtactg gaggctgaag ttcagatgtg cggcgagttg cgtgactacc tacgggtaac 1260
agtttcttta tggcagggtg aaacgcaggt cgccagcggc accgcgcctt tcggcggtga 1320
aattatcgat gagcgtggtg gttatgccga tcgcgtcaca ctacgtctga acgtcgaaaa 1380
cccgaaactg tggagcgccg aaatcccgaa tctctatcgt gcggtggttg aactgcacac 1440
cgccgacggc acgctgattg aagcagaagc ctgcgatgtc ggtttccgcg aggtgcggat 1500
tgaaaatggt ctgctgctgc tgaacggcaa gccgttgctg attcgaggcg ttaaccgtca 1560
cgagcatcat cctctgcatg gtcaggtcat ggatgagcag acgatggtgc aggatatcct 1620
gctgatgaag cagaacaact ttaacgccgt gcgctgttcg cattatccga accatccgct 1680
gtggtacacg ctgtgcgacc gctacggcct gtatgtggtg gatgaagcca atattgaaac 1740
ccacggcatg gtgccaatga atcgtctgac cgatgatccg cgctggctac cggcgatgag 1800
cgaacgcgta acgcgaatgg tgcagcgcga tcgtaatcac ccgagtgtga tcatctggtc 1860
gctggggaat gaatcaggcc acggcgctaa tcacgacgcg ctgtatcgct ggatcaaatc 1920
tgtcgatcct tcccgcccgg tgcagtatga aggcggcgga gccgacacca cggccaccga 1980
tattatttgc ccgatgtacg cgcgcgtgga tgaagaccag cccttcccgg ctgtgccgaa 2040
atggtccatc aaaaaatggc tttcgctacc tggagagacg cgcccgctga tcctttgcga 2100
atacgcccac gcgatgggta acagtcttgg cggtttcgct aaatactggc aggcgtttcg 2160
tcagtatccc cgtttacagg gcggcttcgt ctgggactgg gtggatcagt cgctgattaa 2220
atatgatgaa aacggcaacc cgtggtcggc ttacggcggt gattttggcg atacgccgaa 2280
cgatcgccag ttctgtatga acggtctggt ctttgccgac cgcacgccgc atccagcgct 2340
gacggaagca aaacaccagc agcagttttt ccagttccgt ttatccgggc aaaccatcga 2400
agtgaccagc gaatacctgt tccgtcatag cgataacgag ctcctgcact ggatggtggc 2460
gctggatggt aagccgctgg caagcggtga agtgcctctg gatgtcgctc cacaaggtaa 2520
acagttgatt gaactgcctg aactaccgca gccggagagc gccgggcaac tctggctcac 2580
agtacgcgta gtgcaaccga acgcgaccgc atggtcagaa gccgggcaca tcagcgcctg 2640
gcagcagtgg cgtctggcgg aaaacctcag tgtgacgctc cccgccgcgt cccacgccat 2700
cccgcatctg accaccagcg aaatggattt ttgcatcgag ctgggtaata agcgttggca 2760
atttaaccgc cagtcaggct ttctttcaca gatgtggatt ggcgataaaa aacaactgct 2820
gacgccgctg cgcgatcagt tcacccgtgc accgctggat aacgacattg gcgtaagtga 2880
agcgacccgc attgacccta acgcctgggt cgaacgctgg aaggcggcgg gccattacca 2940
ggccgaagca gcgttgttgc agtgcacggc agatacactt gctgatgcgg tgctgattac 3000
gaccgctcac gcgtggcagc atcaggggaa aaccttattt atcagccgga aaacctaccg 3060
gattgatggt agtggtcaaa tggcgattac cgttgatgtt gaagtggcga gcgatacacc 3120
gcatccggcg cggattggcc tgaactgcca gctggcgcag gtagcagagc gggtaaactg 3180
gctcggatta gggccgcaag aaaactatcc cgaccgcctt actgccgcct gttttgaccg 3240
ctgggatctg ccattgtcag acatgtatac cccgtacgtc ttcccgagcg aaaacggtct 3300
gcgctgcggg acgcgcgaat tgaattatgg cccacaccag tggcgcggcg acttccagtt 3360
caacatcagc cgctacagtc aacagcaact gatggaaacc agccatcgcc atctgctgca 3420
cgcggaagaa ggcacatggc tgaatatcga cggtttccat atggggattg gtggcgacga 3480
ctcctggagc ccgtcagtat cggcggaatt ccagctgagc gccggtcgct accattacca 3540
gttggtctgg tgtcaaaaat aataataacc gggcaggggg gatctttgtg aaggaacctt 3600
acttctgtgg tgtgacataa ttggacaaac tacctacaga gatttaaagc tctaaggtaa 3660
atataaaatt tttaagtgta taatgtgtta aactactgat tctaattgtt tgtgtatttt 3720
agattccaac ctatggaact gatgaatggg agcagtggtg gaatgccaga tccagacatg 3780
ataagataca ttgatgagtt tggacaaacc acaactagaa tgcagtgaaa aaaatgcttt 3840
atttgtgaaa tttgtgatgc tattgcttta tttgtaacca ttataagctg caataaacaa 3900
gttaacaaca acaattgcat tcattttatg tttcaggttc agggggaggt gtgggaggtt 3960
ttttaaagca agtaaaacct ctacaaatgt ggtatggctg attatgatct gcggccgcag 4020
ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 4080
tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 4140
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 4200
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 4260
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 4320
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 4380
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 4440
gcggtattat cccgtattga cgccgggcaa gagcaactcg gtcgccgcat acactattct 4500
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 4560



gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 4620
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 4680
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 4740
gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac tggcgaacta 4800
cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga 4860
ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 4920
gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 4980
gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 5040
gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 5100
ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 5160
gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc 5220
gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 5280
caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 5340
ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg 5400
tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 5460
ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac 5520
tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 5580
cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 5640
gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 5700
ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 5760
gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 5820
agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 5880
tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 5940
tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 6000
gaggaagcgg aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 6060
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 6120
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 6180
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 6240
tacgccaagc tggcgcg 6257

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<210> 16 
<211> 6252 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pRK64 (deltalnt) 



<400> 16 

cgtcatcacc 

ccccaggctc 

ggctccccag 

ccgcccctaa 

catggctgac 

tcgacgacac 

tgaggtggag 

aaaccgcttc 

ccatggtcgc 

ctggcgttac 

gcgaagaggc 

gctttgcctg 

ctgaggccga 

tctacaccaa 

cgacgggttg 

cgcgaattat 

tcgg'ttacgg 

ccggagaaaa 

atcaggatat 

ctacacaaat 

tactggaggc 

ctttatggca 

tcgatgagcg 

aactgtggag 

acggcacgct 

atggtctgct 



gaaacgcgcg 
cccagcaggc 
caggcagaag 
ctccgcccat 
taattttttt 
tgcagagacc 
tacgcgcccg 
tggatccata 
gagtagcttg 
ccaacttaat 
ccgcaccgat 
gtttccggca 
tactgtcgtc 
cgtaacctat 
ttactcgctc 
ttttgatggc 
ccaggacagt 
ccgcctcgcg 
gtggcggatg 
cagcgatttc 
tgaagttcag 
gggtgaaacg 
tggtggttat 
cgccgaaatc 
gattgaagca 
gctgctgaac 



aggcagctgt 
agaagtatgc 
tgtgcaaagc 
cccgccccta 
tatttatgca 
tacttcacta 
gggagcccaa 
acttcgtata 
gcactggccg 
cgccttgcag 
cgcccttccc 
ccagaagcgg 
gtcccctcaa 
cccattacgg 
acatttaatg 
gttaactcgg 
cgtttgccgt 
gtgatggtgc 
agcggcattt 
catgttgcca 
atgtgcggcg 
caggtcgcca 
gccgatcgcg 
ccgaatctct 
gaagcctgcg 
ggcaagccgt 



ggaatgtgtg 
aaagcatgca 
atgcatctca 
actccgccca 
gaggccgagg 
acaaccggta 
aggttacccc 
gcatacatta 
tcgttttaca 
cacatccccc 
aacagttgcg 
tgccggaaag 
actggcagat 
tcaatccgcc 
ttgatgaaag 
cgtttcatct 
ctgaatttga 
tgcgttggag 
tccgtgacgt 
ctcgctttaa 
agttgcgtga 
gcggcaccgc 
tcacactacg 
atcgtgcggt 
atgtcggttt 
tgctgattcg 



tcagttaggg 
tctcaattag 
attagtcagc 
gttccgccca 
ccgcctcggc 
cagttcgtgg 
agttggggca 
tacgaagtta 
acgtcgtgac 
tttcgccagc 
cagcctgaat 
ctggctggag 
gcacggttac 
gtttgttccc 
ctggctacag 
gtggtgcaac 
cctgagcgca 
tgacggcagt 
ctcgttgctg 
tgatgatttc 
ctacctacgg 
gcctttcggc 
tctgaacgtc 
ggttgaactg 
ccgcgaggtg 
aggcgttaac 



tgtggaaagt 
tcagcaacca 
aaccatagtc 
ttctccgccc 
ctaggaacag 
accagatggg 
ctactcccga 
taccgggcca 
tgggaaaacc 
tggcgtaata 
ggcgaatggc 
tgcgatcttc 
gatgcgccca 
acggagaatc 
gaaggccaga 
gggcgctggg 
tttttacgcg 
tatctggaag 
cataaaccga 
agccgcgctg 
gtaacagttt 
ggtgaaatta 
gaaaacccga 
cacaccgccg 
cggattgaaa 
cgtcacgagc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 
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at cat octet 
tgaagcagaa 
acacgctgtg 
gcatggtgcc 
gcgtaacgcg 
ggaatgaatc 
atccttcccg 
tttgeccgat 
ccatcaaaaa 
cccacgcgat 
atccccgttt 
atgaaaaegg 
gccagttctg 
aagcaaaaca 
ecagegaata 
atggtaagcc 
tgattgaact 
gcgtagtgca 
agtggcgtct 
atctgaccac 
accgccagtc 
cgctgcgcga 
cccgcattga 
aagcagcgtt 
ctcacgcgtg 
atggtagtgg 
eggegeggat 
gattagggee 
atetgecatt 
gcgggacgcg 
tcagccgcta 
aagaaggcac 
ggagcccgtc 
tctggtgtca 
tgtggtgtga 
aaatttttaa 
ccaacctatg 
atacattgat 
tgaaatttgt 
caacaacaat 
aagcaagtaa 
cgtgatacgc 
tggcactttt 
aaatatgtat 
gaagagtatg 
ccttcctgtt 
gggtgcacga 
tcgccccgaa 
attatcccgt 
tgacttggtt 
agaattatgc 
aacgategga 
tegecttgat 
cacgatgcct 
tctagcttcc 
tctgcgctcg 
tgggtctege 
tatctacacg 
aggtgectea 
gattgattta 
tctcatgacc 
aaagatcaaa 
aaaaaaacca 
tccgaaggta 
gtagttaggc 
cctgttacca 
acgatagtta 



gcatggtcag 
caactttaac 
cgaccgctac 
aatgaatcgt 
aatggtgcag 
aggccacggc 
cccggtgcag 
gtacgcgcgc 
atggctttcg 
gggtaacagt 
acagggegge 
caacccgtgg 
tatgaaeggt 
ccagcagcag 
cctgttccgt 
gctggcaagc 
gectgaacta 
accgaacgcg 
ggeggaaaac 
cagegaaatg 
aggctttctt 
tcagttcacc 
ccctaacgcc 
gttgcagtgc 
gcagcatcag 
tcaaatggcg 
tggectgaac 
gcaagaaaac 
gtcagacatg 
cgaattgaat 
cagtcaacag 
atggctgaat 
agtatcggcg 
aaaataataa 
cataattgga 
gtgtataatg 
gaactgatga 
gagtttggac 
gatgetattg 
tgcattcatt 
aacctctaca 
ctatttttat 
eggggaaatg 
ccgctcatga 
agtattcaac 
tttgctcacc 
gtgggttaca 
gaacgttttc 
attgacgecg 
gagtactcac 
agtgctgcca 
ggaccgaagg 
cgttgggaac 
gtagcaatgg 
eggcaacaat 
gcccttccgg 
ggtatcattg 
aeggggagtc 
ctgattaagc 
aaacttcatt 
aaaatccctt 
ggatcttctt 
ccgctaccag 
actggcttca 
caccacttca 
gtggctgctg 
ceggataagg 



gtcatggatg 
gccgtgcgct 
ggcctgtatg 
ctgaccgatg 
cgcgatcgta 
gctaatcacg 
tatgaaggcg 
gtggatgaag 
ctacctggag 
cttggcggtt 
ttcgtctggg 
teggcttacg 
ctggtctttg 
tttttccagt 
catagegata 
ggtgaagtgc 
ccgcagccgg 
accgcatggt 
ctcagtgtga 
gatttttgea 
tcacagatgt 
cgtgcaccgc 
tgggtcgaac 
aeggcagata 
gggaaaacct 
attaccgttg 
tgccagctgg 
tatcccgacc 
tataccccgt 
tatggcccac 
caactgatgg 
ategaeggtt 
gaattccagc 
taacegggea 
caaactacct 
tgttaaacta 
atgggagcag 
aaaccacaac 
ctttatttgt 
ttatgtttca 
aatgtggtat 
aggttaatgt 
tgegeggaac 
gacaataacc 
atttccgtgt 
cagaaacget 
tcgaactgga 
caatgatgag 
ggcaagagca 
cagtcacaga 
taaccatgag 
agctaaccgc 
eggagctgaa 
caacaaegtt 
taatagactg 
ctggctggtt 
cagcactggg 
aggcaactat 
attggtaact 
tttaatttaa 
aacgtgagtt 
gagatccttt 
cggtggtttg 
geagagegea 
agaactctgt 
ccagtggcga 
cgcagcggtc 



13 



agcagacgat 
gttegcatta 
tggtggatga 
atccgcgctg 
atcacccgag 
acgcgctgta 
geggagcega 
accagccctt 
agacgcgccc 
tegctaaata 
actgggtgga 
gcggtgattt 
ccgaccgcac 
teegtttate 
acgagctcct 
ctctggatgt 
agagegcegg 
cagaagcegg 
cgctccccgc 
tcgagctggg 
ggattggcga 
tggataacga 
gctggaaggc 
cacttgetga 
tatttatcag 
atgttgaagt 
egcaggtage 
gccttactgc 
acgtcttccc 
accagtggcg 
aaaccagcca 
tccatatggg 
tgagegcegg 
ggggggatct 
acagagattt 
ctgattctaa 
tggtggaatg 
tagaatgeag 
aaccattata 
ggttcagggg 
ggctgattat 
catgataata 
ccctatttgt 
ctgataaatg 
cgcccttatt 
ggtgaaagta 
tctcaacagc 
cacttttaaa 
actcggtcgc 
aaagcatctt 
tgataacact 
ttttttgeae 
tgaagecata 
gegcaaacta 
gatggaggcg 
tattgetgat 
gccagatggt 
ggatgaacga 
gtcagac.caa 
aaggatctag 
ttcgttccac 
ttttctgege 
tttgeeggat 
gataccaaat 
agcaccgcct 
taagtcgtgt 
gggctgaacg 



ggtgcaggat 
tccgaaccat 
agecaatatt 
gctaccggcg 
tgtgatcatc 
tegctggate 
caccacggcc 
cccggctgtg 
gctgatcctt 
ctggcaggcg 
teagtegctg 
tggcgatacg 
gccgcatcca 
cgggcaaacc 
gcactggatg 
cgctccacaa 
gcaactctgg 
gcacatcagc 
cgcgtcccac 
taataagcgt 
taaaaaacaa 
cattggcgta 
ggegggecat 
tgcggtgctg 
ccggaaaacc 
ggegagegat 
agagegggta 
cgcctgtttt 
gagegaaaac 
cggcgacttc 
tcgccatctg 
gattggtggc 
tcgctaccat 
ttgtgaagga 
aaagctctaa 
ttgtttgtgt 
ccagatccag 
tgaaaaaaat 
agetgeaata 
gaggtgtggg 
gatctgegge 
atggtttctt 
ttatttttct 
cttcaataat 
cccttttttg 
aaagatgctg 
ggtaagatcc 
gttctgetat 
cgcatacact 
aeggatggea 
gcggccaact 
aacatggggg 
ccaaacgacg 
ttaactggcg 
gataaagttg 
aaatctggag 
aagccctccc 
aatagacaga 
gtttactcat 
gtgaagatcc 
tgagcgtcag 
gtaatctget 
caagagct.ac 
actgtccttc 
acatacctcg 
ettacegggt 
gggggttcgt 
atcctgctga 1620 
ccgctgtggt 1680 
gaaacccacg 1740 
atgagcgaac 1800 
tggtcgctgg 1860 
aaatctgtcg 1920 
accgatatta 1980 
ccgaaatggt 2040 
tgcgaatacg 2100 
tttcgtcagt 2160 
attaaatatg 2220 
ccgaacgatc 2280 
gcgctgacgg 2340 
atcgaagtga 2400 
gtggcgctgg 2460 
ggtaaacagt 2520 
ctcacagtac 2580 
gcctggcagc 2640 
gccatcccgc 2700 
tggcaattta 2760 
ctgctgacgc 2820 
agtgaagcga 2880 
taccaggccg 2940 
attacgaccg 3000 
taccggattg 3060 
acaccgcatc 3120 
aactggctcg 3180 
gaccgctggg 3240 
ggtctgcgct 3300 
cagttcaaca 3360 
ctgcacgcgg 3420 
gacgactcct 3480 
taccagttgg 3540 
accttacttc 3600 
ggtaaatata 3660 
attttagatt 3720 
acatgataag 3780 
gctttatttg 3840 
aacaagttaa 3900 
aggtttttta 3960 
cgcagggcct 4020 
agacgtcagg 4080 
aaatacattc 4140 
attgaaaaag 4200 
cggcattttg 4260 
aagatcagtt 4320 
ttgagagttt 4380 
gtggcgcggt 4440 
attctcagaa 4500 
tgacagtaag 4560 
tacttctgac 4620 
atcatgtaac 4680 
agcgtgacac 4740 
aactacttac 4800 
caggaccact 4860 
ccggtgagcg 4920 
gtatcgtagt 4980 
tcgctgagat 5040 
atatacttta 5100 
tttttgataa 5160 
accccgtaga 5220 
gcttgcaaac 5280 
caactctttt 5340 
tagtgtagcc 5400 
ctctgctaat 5460 
tggactcaag 5520 
gcacacagcc 5580 
cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag 5640 

cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac 5700 

aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg 5760 

gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct 5820 

atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc 5880 

tcacatgttc tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga 5940 

gtgagctgat accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga 6000 

agcggaagag cgcccaatac gcaaaccgcc tctccccgcg cgttggccga ttcattaatg 6060 

cagctggcac gacaggtttc ccgactggaa agcgggcagt gagcgcaacg caattaatgt 6120 

gagttagctc actcattagg caccccaggc tttacacttt atgcttccgg ctcgtatgtt 6180 

gtgtggaatt gtgagcggat aacaatttca cacaggaaac agctatgacc atgattacgc 6240 

caagctggcg cg 6252 



<210> 17 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer P64-1 
<400> 17 

tcagcaacca ggctccccag caggc 25 

<210> 18 
<211> 27 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer 64-4 
<400> 18 

gacgacagta tcggcctcag gaagatc 27 

<210> 19 
<211> 840 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 80d 



<400> 19 

ggtaccgagc tcggatcctc tagtaacggc cgccagtgtg ctggaattcg gcttcagcaa 60 
ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccagg 120 
tgtggaaagt ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag 180 
tcagcaacca tagtcccgcc cctaactccg cccatcccgc ccctaactcc gcccagttcc 240 
gcccattctc cgccccatgg ctgactaatt ttttttattt atgcagaggc cgaggccgcc 300 
tcggcctagg aacagtcgac gacactgcag agacctactt cactaacaac cggtacagtt 360 
cgtggaccag atgggtgagg tggagtacgc gcccggggag cccaaaggtt accccagttg 420 
gggcactact cccgaaaacc gcttctggat ccataacttc gtatagcata cattatacga 480 
agttataccg ggccaccatg gtcgcgagta gcttggcact ggggttgctt ttgcgnygtc 540 
gtgactggga aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg 600 
ccagctggcg taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagct 660 
gaatggcgaa tggcgctttg cctggcttcc ggcaccagaa gcggtgccgg aaagctggct 720 
ggagtgcgat cttcctgagg ccgatactgt cgtcaagccg aattctgcag atatccatca 780 
cactggcggc cgctcgagca tgcatctaga gggccaattc gccctatagt gagtcgtatt 840 



<210> 20 

<211> 1842 
<212> DNA 
<213> Bacteriophage phi-C31 

<220> 

<221> CDS 

<222> (1)..(1839) 



<400> 20 

atg aca caa ggg gtt gtg acc ggg gtg gac acg tac gcg ggt gct tac 48 
Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr 
1 5 10 15 

gac cgt cag tcg cgc gag cgc gag aat tcg agc gca gca agc cca gcg 96 
Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala 
20 25 30 

aca cag cgt agc gcc aac gaa gac aag gcg gcc gac ctt cag cgc gaa 144 
Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu 
35 40 45 

gtc gag cgc gac ggg ggc cgg ttc agg ttc gtc ggg cat ttc agc gaa 192 
Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu 
50 55 60 

gcg ccg ggc acg tcg gcg ttc ggg acg gcg gag cgc ccg gag ttc gaa 240 
Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu 
65 70 75 80 

cgc atc ctg aac gaa tgc cgc gcc ggg cgg ctc aac atg atc att gtc 288 
Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val 
85 90 95 

tat gac gtg tcg cgc ttc tcg cgc ctg aag gtc atg gac gcg att ccg 336 
Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro 
100 105 110 

att gtc tcg gaa ttg ctc gcc ctg ggc gtg acg att gtt tcc act cag 384 
Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln 
115 120 125 

gaa ggc gtc ttc cgg cag gga aac gtc atg gac ctg att cac ctg att 432 
Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile 
130 135 140 

atg cgg ctc gac gcg tcg cac aaa gaa tct tcg ctg aag tcg gcg aag 480 
Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys 
145 150 155 160 

att ctc gac acg aag aac ctt cag cgc gaa ttg ggc ggg tac gtc ggc 528 
Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly 
165 170 175 

ggg aag gcg cct tac ggc ttc gag ctt gtt tcg gag acg aag gag atc 576 
Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile 
180 185 190 

acg cgc aac ggc cga atg gtc aat gtc gtc atc aac aag ctt gcg cac 624 
Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His 
195 200 205 

tcg acc act ccc ctt acc gga ccc ttc gag ttc gag ccc gac gta atc 672 
Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile 
210 215 220 

cgg tgg tgg tgg cgt gag atc aag acg cac aaa cac ctt ccc ttc aag 720 
Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys 
225 230 235 240 
ccg ggc agt caa gcc gcc att cac ccg ggc agc atc acg ggg ctt tgt 768 
Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys 
245 250 255 

aag cgc atg gac gct gac gcc gtg ccg acc cgg ggc gag acg att ggg 816 
Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly 
260 265 270 

aag aag acc gct tca agc gcc tgg gac ccg gca acc gtt atg cga atc 864 
Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile 
275 280 285 

ctt cgg gac ccg cgt att gcg ggc ttc gcc gct gag gtg atc tac aag 912 
Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys 
290 295 300 

aag aag ccg gac ggc acg ccg acc acg aag att gag ggt tac cgc att 960 
Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile 
305 310 315 320 

cag cgc gac ccg atc acg ctc cgg ccg gtc gag ctt gat tgc gga ccg 1008 
Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro 
325 330 335 

atc atc gag ccc gct gag tgg tat gag ctt cag gcg tgg ttg gac ggc 1056 
Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly 
340 345 350 

agg ggg cgc ggc aag ggg ctt tcc cgg ggg caa gcc att ctg tcc gcc 1104 
Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala 
355 360 365 

atg gac aag ctg tac tgc gag tgt ggc gcc gtc atg act tcg aag cgc 1152 
Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg 
370 375 380 

ggg gaa gaa tcg atc aag gac tct tac cgc tgc cgt cgc cgg aag gtg 1200 
Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val 
385 390 395 400 

gtc gac ccg tcc gca cct ggg cag cac gaa ggc acg tgc aac gtc agc 1248 
Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser 
405 410 415 

atg gcg gca ctc gac aag ttc gtt gcg gaa cgc atc ttc aac aag atc 1296 
Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile 
420 425 430 

agg cac gcc gaa ggc gac gaa gag acg ttg gcg ctt ctg tgg gaa gcc 1344 
Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala 
435 440 445 

gcc cga cgc ttc ggc aag ctc act gag gcg cct gag aag agc ggc gaa 1392 
Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu 
450 455 460 

cgg gcg aac ctt gtt gcg gag cgc gcc gac gcc ctg aac gcc ctt gaa 1440 
Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu 
465 470 475 480 

gag ctg tac gaa gac cgc gcg gca ggc gcg tac gac gga ccc gtt ggc 1488 
Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly 
485 490 495 

agg aag cac ttc cgg aag caa cag gca gcg ctg acg ctc cgg cag caa 1536 
Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln 
500 505 510 
ggg gcg gaa gag cgg ctt gcc gaa ctt gaa gcc gcc gaa gcc ccg aag 1584 
Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys 
515 520 525 

ctt ccc ctt gac caa tgg ttc ccc gaa gac gcc gac gct gac ccg acc 1632 
Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr 
530 535 540 

ggc cct aag tcg tgg tgg ggg cgc gcg tca gta gac gac aag cgc gtg 1680 
Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val 
545 550 555 560 

ttc gtc ggg ctc ttc gta gac aag atc gtt gtc acg aag tcg act acg 1728 
Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr 
565 570 575 

ggc agg ggg cag gga acg ccc atc gag aag cgc gct tcg atc acg tgg 1776 
Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp 
580 585 590 

gcg aag ccg ccg acc gac gac gac gaa gac gac gcc cag gac ggc acg 1824 
Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr 
595 600 605 

gaa gac gta gcg gcg tag 1842 
Glu Asp Val Ala Ala 
610 
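
As a consistency check on the coding sequence of SEQ ID NO: 20, the following Python sketch translates codons with the standard genetic code and compares the result with the amino acid rows printed under them. It is an illustration only and not part of the filed listing: the helper name `translate` and the compact codon-table construction are assumptions introduced here, and the two input rows (nucleotides 913-960 and 1825-1842) were typed in by hand from the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) up to the first stop codon."""
    dna = "".join(dna.split()).lower()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the coding region
            break
        protein.append(aa)
    return "".join(protein)


# Rows copied by hand from SEQ ID NO: 20 (assumed free of OCR damage).
ROW_913_960 = "aag aag ccg gac ggc acg ccg acc acg aag att gag ggt tac cgc att"
ROW_1825_1842 = "gaa gac gta gcg gcg tag"

print(translate(ROW_913_960))    # residues 305-320
print(translate(ROW_1825_1842))  # residues 609-613, then the stop codon
```

The CDS feature (1)..(1839) spans 1839 / 3 = 613 codons, which agrees with the 613 residues of SEQ ID NO: 21; the stop codon brings the record to the 1842 nucleotides stated in <211>.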



<210> 21 
<211> 613 
<212> PRT 

<213> Bacteriophage phi-C31 



<400> 21 

Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr 
1 5 10 15 

Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala 
20 25 30 

Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu 
35 40 45 

Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu 
50 55 60 

Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu 
65 70 75 80 

Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val 
85 90 95 

Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro 
100 105 110 

Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln 
115 120 125 

Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile 
130 135 140 

Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys 
145 150 155 160 

Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly 
165 170 175 

Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile 
180 185 190 

Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His 
195 200 205 

Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile 
210 215 220 

Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys 
225 230 235 240 

Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys 
245 250 255 

Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly 
260 265 270 

Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile 
275 280 285 

Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys 
290 295 300 

Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile 
305 310 315 320 

Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro 
325 330 335 

Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly 
340 345 350 

Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala 
355 360 365 

Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg 
370 375 380 

Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val 
385 390 395 400 

Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser 
405 410 415 

Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile 
420 425 430 

Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala 
435 440 445 

Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu 
450 455 460 

Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu 
465 470 475 480 

Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly 
485 490 495 

Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln 
500 505 510 

Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys 
515 520 525 
Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr 
530 535 540 

Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val 
545 550 555 560 

Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr 
565 570 575 

Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp 
580 585 590 

Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr 
595 600 605 

Glu Asp Val Ala Ala 
610 



<210> 22 
<211> 1863 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: DNA sequence 
coding for fusion protein C31-Int (CNLS) 

<220> 

<221> CDS 

<222> (1)..(1860) 

<400> 22 

atg aca caa ggg gtt gtg acc ggg gtg gac acg tac gcg ggt gct tac 48 
Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr 
1 5 10 15 

gac cgt cag tcg cgc gag cgc gag aat tcg agc gca gca agc cca gcg 96 
Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala 
20 25 30 

aca cag cgt agc gcc aac gaa gac aag gcg gcc gac ctt cag cgc gaa 144 
Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu 
35 40 45 

gtc gag cgc gac ggg ggc cgg ttc agg ttc gtc ggg cat ttc agc gaa 192 
Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu 
50 55 60 

gcg ccg ggc acg tcg gcg ttc ggg acg gcg gag cgc ccg gag ttc gaa 240 
Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu 
65 70 75 80 

cgc atc ctg aac gaa tgc cgc gcc ggg cgg ctc aac atg atc att gtc 288 
Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val 
85 90 95 

tat gac gtg tcg cgc ttc tcg cgc ctg aag gtc atg gac gcg att ccg 336 
Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro 
100 105 110 

att gtc tcg gaa ttg ctc gcc ctg ggc gtg acg att gtt tcc act cag 384 
Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln 
115 120 125 
gaa ggc gtc ttc cgg cag gga aac gtc atg gac ctg att cac ctg att 432 
Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile 
130 135 140 

atg cgg ctc gac gcg tcg cac aaa gaa tct tcg ctg aag tcg gcg aag 480 
Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys 
145 150 155 160 

att ctc gac acg aag aac ctt cag cgc gaa ttg ggc ggg tac gtc ggc 528 
Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly 
165 170 175 

ggg aag gcg cct tac ggc ttc gag ctt gtt tcg gag acg aag gag atc 576 
Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile 
180 185 190 

acg cgc aac ggc cga atg gtc aat gtc gtc atc aac aag ctt gcg cac 624 
Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His 
195 200 205 

tcg acc act ccc ctt acc gga ccc ttc gag ttc gag ccc gac gta atc 672 
Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile 
210 215 220 

cgg tgg tgg tgg cgt gag atc aag acg cac aaa cac ctt ccc ttc aaa 720 
Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys 
225 230 235 240 

ccg ggc agt caa gcc gcc att cac ccg ggc agc atc acg ggg ctt tgt 768 
Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys 
245 250 255 

aag cgc atg gac gct gac gcc gtg ccg acc cgg ggc gag acg att ggg 816 
Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly 
260 265 270 

aag aag acc gct tca agc gcc tgg gac ccg gca acc gtt atg cga atc 864 
Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile 
275 280 285 

ctt cgg gac ccg cgt att gcg ggc ttc gcc gct gag gtg atc tac aag 912 
Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys 
290 295 300 

aag aag ccg gac ggc acg ccg acc acg aag att gag ggt tac cgc att 960 
Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile 
305 310 315 320 

cag cgc gac ccg atc acg ctc cgg ccg gtc gag ctt gat tgc gga ccg 1008 
Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro 
325 330 335 

atc atc gag ccc gct gag tgg tat gag ctt cag gcg tgg ttg gac ggc 1056 
Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly 
340 345 350 

agg ggg cgc ggc aag ggg ctt tcc cgg ggg caa gcc att ctg tcc gcc 1104 
Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala 
355 360 365 

atg gac aag ctg tac tgc gag tgt ggc gcc gtc atg act tcg aag cgc 1152 
Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg 
370 375 380 

ggg gaa gaa tcg atc aag gac tct tac cgc tgc cgt cgc cgg aag gtg 1200 
Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val 
385 390 395 400 
gtc gac ccg tcc gca cct ggg cag cac gaa ggc acg tgc aac gtc agc 1248 
Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser 
405 410 415 

atg gcg gca ctc gac aag ttc gtt gcg gaa cgc atc ttc aac aag atc 1296 
Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile 
420 425 430 

agg cac gcc gaa ggc gac gaa gag acg ttg gcg ctt ctg tgg gaa gcc 1344 
Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala 
435 440 445 

gcc cga cgc ttc ggc aag ctc act gag gcg cct gag aag agc ggc gaa 1392 
Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu 
450 455 460 

cgg gcg aac ctt gtt gcg gag cgc gcc gac gcc ctg aac gcc ctt gaa 1440 
Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu 
465 470 475 480 

gag ctg tac gaa gac cgc gcg gca ggc gcg tac gac gga ccc gtt ggc 1488 
Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly 
485 490 495 

agg aag cac ttc cgg aag caa cag gca gcg ctg acg ctc cgg cag caa 1536 
Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln 
500 505 510 

ggg gcg gaa gag cgg ctt gcc gaa ctt gaa gcc gcc gaa gcc ccg aag 1584 
Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys 
515 520 525 

ctt ccc ctt gac caa tgg ttc ccc gaa gac gcc gac gct gac ccg acc 1632 
Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr 
530 535 540 

ggc cct aag tcg tgg tgg ggg cgc gcg tca gta gac gac aag cgc gtg 1680 
Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val 
545 550 555 560 

ttc gtc ggg ctc ttc gta gac aag atc gtt gtc acg aag tcg act acg 1728 
Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr 
565 570 575 

ggc agg ggg cag gga acg ccc atc gag aag cgc gct tcg atc acg tgg 1776 
Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp 
580 585 590 

gcg aag ccg ccg acc gac gac gac gaa gac gac gcc cag gac ggc acg 1824 
Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr 
595 600 605 

gaa gac gta gcg gcg cct aag aag aag agg aag gtt tag 1863 
Glu Asp Val Ala Ala Pro Lys Lys Lys Arg Lys Val 
610 615 620 



<210> 23 
<211> 620 
<212> PRT 

<213> Artificial Sequence 

<223> Description of Artificial Sequence: DNA sequence 
coding for fusion protein C31-Int (CNLS) 



<400> 23 

Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr 
1 5 10 15 

Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala 
20 25 30 

Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu 
35 40 45 

Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu 
50 55 60 

Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu 
65 70 75 80 

Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val 
85 90 95 

Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro 
100 105 110 

Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln 
115 120 125 

Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile 
130 135 140 

Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys 
145 150 155 160 

Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly 
165 170 175 

Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile 
180 185 190 

Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His 
195 200 205 

Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile 
210 215 220 

Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys 
225 230 235 240 

Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys 
245 250 255 

Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly 
260 265 270 

Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile 
275 280 285 

Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys 
290 295 300 

Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile 
305 310 315 320 

Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro 
325 330 335 

Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly 
340 345 350 

Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala 
355 360 365 
Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg 
370 375 380 

Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val 
385 390 395 400 

Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser 
405 410 415 

Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile 
420 425 430 

Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala 
435 440 445 

Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu 
450 455 460 

Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu 
465 470 475 480 

Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly 
485 490 495 

Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln 
500 505 510 

Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys 
515 520 525 

Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr 
530 535 540 

Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val 
545 550 555 560 

Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr 
565 570 575 

Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp 
580 585 590 

Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr 
595 600 605 

Glu Asp Val Ala Ala Pro Lys Lys Lys Arg Lys Val 
610 615 620 
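
According to the listing, SEQ ID NO: 23 (620 residues) ends with seven residues that are absent from the native integrase of SEQ ID NO: 21 (613 residues). The Python sketch below isolates that C-terminal extension so it can be compared with the NLS entries of the listing. The function names and the lookup dictionary are hypothetical helpers introduced for illustration, and only the final row of each sequence is typed in, so the check covers the C-terminus and not the full proteins.

```python
# Three-letter to one-letter amino acid codes (the 20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def c_terminal_extension(native_tail, fusion_tail):
    """Return the residues the fusion carries beyond the native C-terminus."""
    native = to_one_letter(native_tail)
    fusion = to_one_letter(fusion_tail)
    if not fusion.startswith(native):
        raise ValueError("fusion tail does not begin with the native tail")
    return fusion[len(native):]


# Final rows as printed in the listing.
NATIVE_TAIL = "Glu Asp Val Ala Ala"                              # SEQ ID NO: 21, 609-613
FUSION_TAIL = "Glu Asp Val Ala Ala Pro Lys Lys Lys Arg Lys Val"  # SEQ ID NO: 23, 609-620

extension = c_terminal_extension(NATIVE_TAIL, FUSION_TAIL)
print(extension, len(extension))
```

The extension is seven residues long, consistent with 613 + 7 = 620, and its residues are the same as the seven-residue NLS listed as SEQ ID NO: 53 (Pro Lys Lys Lys Arg Lys Val).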



<210> 24 
<211> 43 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 24 

Met Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Cys Arg Leu Lys 
1 5 10 15 

Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu Lys 
20 25 30 

Lys Lys Lys Lys Arg Arg Arg Lys Thr Lys Arg 
35 40 



<210> 25 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 25 

Ile Lys Tyr Phe Lys Lys Phe Pro Lys Asp 
1 5 10 

<210> 26 
<211> 14 
<212> PRT 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: NLS 
<400> 26 

Met Thr Gly Ser Lys Thr Arg Lys His Arg Gly Ser Gly Ala 
1 5 10 



<210> 27 
<211> 14 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 27 

Met Thr Gly Ser Lys His Arg Lys His Pro Gly Ser Gly Ala 
1 5 10 



<210> 28 
<211> 7 
<212> PRT 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: NLS 
<400> 28 

Gly Lys Lys Arg Ser Lys Ala 
1 5 



<210> 29 

<211> 14 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 29 

Pro Lys Lys Ala Arg Glu Asp Val Ser Arg Lys Arg Pro Arg 
1 5 10 
<210> 30 
<211> 11 
<212> PRT 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: NLS 

<400> 30 

Ala Pro Lys Arg Lys Ser Gly Val Ser Lys Cys 
1 5 10 

<210> 31 
<211> 12 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 31 

Glu Glu Asp Gly Pro Gln Lys Lys Lys Arg Arg Leu 
1 5 10 

<210> 32 
<211> 8 
<212> PRT 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: NLS 
<400> 32 

Ala Pro Thr Lys Arg Lys Gly Ser 
1 5 



<210> 33 
<211> 7 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: NLS 
<400> 33 

Pro Asn Lys Lys Lys Arg Lys 
1 5 



<210> 34 
<211> 5 

<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: NLS 
<400> 34 

Lys Arg Pro Arg Pro 
1 5 

<210> 35 
<211> 11 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 35 

Cys Gly Gly Leu Ser Ser Lys Arg Pro Arg Pro 
1 5 10 



PCT/EP01/12975 



<210> 36 
<211> 19 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: NLS 
<400> 36 

Pro Pro Lys Lys Arg Met Arg Arg Arg Ile Glu Pro Lys Lys Lys Lys 
1 5 10 15 

Lys Arg Pro 



<210> 37 
<211> 11 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: NLS 
<400> 37 

Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys 
1 5 10 



<210> 38 
<211> 9 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: NLS 

<400> 38 

Pro Lys Gln Lys Arg Lys Met Ala Arg 
1 5 



<210> 39 
<211> 9 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 39 

Ser Val Thr Lys Lys Arg Lys Leu Glu 
1 5 
<210> 40 
<211> 11 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 40 

Cys Gly Gly Ala Ala Lys Arg Val Lys Leu Asp 
1 5 10 



<210> 41 
<211> 9 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 41 

Pro Ala Ala Lys Arg Val Lys Leu Asp 
1 5 



<210> 42 
<211> 11 
<212> PRT 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: NLS 

<400> 42 

Arg Gln Arg Arg Asn Glu Leu Lys Arg Ser Pro 
1 5 10 

<210> 43 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 43 

Pro Gln Ser Arg Lys Lys Leu Arg 
1 5 



<210> 44 
<211> 8 
<212> PRT 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: NLS 
<400> 44 

Pro Leu Leu Lys Lys Ile Lys Gln 
1 5 



<210> 45 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 45 

Pro Gln Pro Lys Lys Lys Pro 
1 5 



<210> 46 
<211> 9 
<212> PRT 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: NLS 

<400> 46 

Ser Lys Arg Val Ala Lys Arg Lys Leu 
1 5 

<210> 47 
<211> 9 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 47 

Ala Ser Lys Ser Arg Lys Arg Lys Leu 
1 5 



<210> 48 
<211> 16 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: NLS 
<400> 48 

Gly Gly Leu Cys Ser Ala Arg Leu His Arg His Ala Leu Leu Ala Thr 
1 5 10 15 

<210> 49 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 49 

Arg Lys Thr Lys Lys Lys Ile Lys 
1 5 



<210> 50 
<211> 8 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 

<400> 50 

Arg Lys Leu Lys Lys Leu Gly Asn 
1 5 

<210> 51 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 51 

Arg Lys Asp Arg Arg Gly Gly Arg 
1 5 

<210> 52 
<211> 18 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: NLS 
<400> 52 

Asp Thr Arg Glu Lys Lys Lys Phe Leu Lys Arg Arg Leu Leu Arg Leu 
1 5 10 15 

Asp Glu 



<210> 53 
<211> 7 

<212> PRT 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: NLS 
<400> 53 

Pro Lys Lys Lys Arg Lys Val 
1 5 



<210> 54 

<211> 1410 

<212> DNA 

<213> Bacteriophage R4 



<220> 

<221> CDS 

<222> (1)..(1407) 

<400> 54 

atg aat cga ggg ggg ccc act gta cgg gcc gac atc tac gtc cga atc 48 
Met Asn Arg Gly Gly Pro Thr Val Arg Ala Asp Ile Tyr Val Arg Ile 
1 5 10 15 

agc ctg gac cgc aca ggg gaa gag ctc ggg gtc gag cgc cag gag gag 96 
Ser Leu Asp Arg Thr Gly Glu Glu Leu Gly Val Glu Arg Gln Glu Glu 
20 25 30 

tcg tgt cgc gag ctc tgc aag agc ctc ggc atg gag gtg ggg cag gtg 144 
Ser Cys Arg Glu Leu Cys Lys Ser Leu Gly Met Glu Val Gly Gln Val 
35 40 45 

tgg gtc gac aac gac ctg agc gcc acc aag aag aac gtc gtc cgc cct 192 
Trp Val Asp Asn Asp Leu Ser Ala Thr Lys Lys Asn Val Val Arg Pro 
50 55 60 

gac ttc gag gcg atg atc gcg agc aac ccg cag gcg atc gtc tgc tgg 240 
Asp Phe Glu Ala Met Ile Ala Ser Asn Pro Gln Ala Ile Val Cys Trp 
65 70 75 80 



cac acc gac cgg ctc atc cgc gtc acg cgg gac ctg gag cgg gtg atc 288 
His Thr Asp Arg Leu Ile Arg Val Thr Arg Asp Leu Glu Arg Val Ile 
85 90 95 

gac ctc gga gtc aac gtc cac gcc gtg atg gcc gga cac ctg gac ctg 336 
Asp Leu Gly Val Asn Val His Ala Val Met Ala Gly His Leu Asp Leu 
100 105 110 

tcc acc ccg gcc ggc cga gcc gtc gcc cgc acg gtg acg gcc tgg gcc 384 
Ser Thr Pro Ala Gly Arg Ala Val Ala Arg Thr Val Thr Ala Trp Ala 
115 120 125 

acg tac gag ggc gag cag aag gct gag cgc cag aag ctc gcc aac atc 432 
Thr Tyr Glu Gly Glu Gln Lys Ala Glu Arg Gln Lys Leu Ala Asn Ile 
130 135 140 

cag aac gcc cgc gcc ggc aag ccg tac acc ccc ggc atc cgc ccc ttc 480 
Gln Asn Ala Arg Ala Gly Lys Pro Tyr Thr Pro Gly Ile Arg Pro Phe 
145 150 155 160 

ggg tac ggc gac gac cac atg acc atc gtg acg gcc gag gcg gac gcc 528 
Gly Tyr Gly Asp Asp His Met Thr Ile Val Thr Ala Glu Ala Asp Ala 
165 170 175 

atc cgc gac ggc gcg aag atg atc ctc gac ggc tgg tcc ctg tcg gcc 576 
Ile Arg Asp Gly Ala Lys Met Ile Leu Asp Gly Trp Ser Leu Ser Ala 
180 185 190 

gtg gct cgc tac tgg gag gag ctc aag ctc cag tcg ccc cgg agt atg 624 
Val Ala Arg Tyr Trp Glu Glu Leu Lys Leu Gln Ser Pro Arg Ser Met 
195 200 205 

gcc gca ggc ggc aag ggc tgg tct ctg cgg ggc gta aag aag gtg ctg 672 
Ala Ala Gly Gly Lys Gly Trp Ser Leu Arg Gly Val Lys Lys Val Leu 
210 215 220 

acc tcc ccg cgc tac gtc ggg cgg tcc agc tac ctc ggg gag gtc gtg 720 
Thr Ser Pro Arg Tyr Val Gly Arg Ser Ser Tyr Leu Gly Glu Val Val 
225 230 235 240 

ggc gat gct cag tgg ccg ccc atc ctc gac ccg gac gtc tac tac ggg 768 
Gly Asp Ala Gln Trp Pro Pro Ile Leu Asp Pro Asp Val Tyr Tyr Gly 
245 250 255 

gtc gtg gcc atc ctg aac aac ccc gac cgc ttc agc ggg ggc cct cgg 816 
Val Val Ala Ile Leu Asn Asn Pro Asp Arg Phe Ser Gly Gly Pro Arg 
260 265 270 

acc ggc cgc acc ccc ggc acg ctg ctc gca ggc atc gcc ttg tgc ggt 864 
Thr Gly Arg Thr Pro Gly Thr Leu Leu Ala Gly Ile Ala Leu Cys Gly 
275 280 285 

gag tgc ggc aag acg gtc agt gga cgc ggc tac cga ggt gtc ctg gtc 912 
Glu Cys Gly Lys Thr Val Ser Gly Arg Gly Tyr Arg Gly Val Leu Val 
290 295 300 

tac gga tgt aag gac acg cac act cgg acg cct cgg agc atc gct gac 960 
Tyr Gly Cys Lys Asp Thr His Thr Arg Thr Pro Arg Ser Ile Ala Asp 
305 310 315 320 

ggc cgc gcg agc agc tcg acc ctc gcc cgg ctc atg ttc ccc gac ttc 1008 
Gly Arg Ala Ser Ser Ser Thr Leu Ala Arg Leu Met Phe Pro Asp Phe 
325 330 335 

ctg ccc ggc ctc ctg gcc tct ggg cag gcc gag gac ggc cag tcg gca 1056 
Leu Pro Gly Leu Leu Ala Ser Gly Gln Ala Glu Asp Gly Gln Ser Ala 
340 345 350 

gca tcc aag cac tcg gag gcc cag acg ctg cgc gag cgc ctt gac ggg 1104 
Ala Ser Lys His Ser Glu Ala Gln Thr Leu Arg Glu Arg Leu Asp Gly 
355 360 365 

ctg gct acg gcc tac gcg gag ggt gcg atc agc ctg tct cag atg acg 1152 
Leu Ala Thr Ala Tyr Ala Glu Gly Ala Ile Ser Leu Ser Gln Met Thr 
370 375 380 

gcc ggc tcg gaa gca ctg cgg aag aag ctg gag gtg atc gaa gcc gac 1200 
Ala Gly Ser Glu Ala Leu Arg Lys Lys Leu Glu Val Ile Glu Ala Asp 
385 390 395 400 

ctc gtg ggc tcg gca ggc atc ccg ccc ttc gat cca gtg gcc gga gtg 1248 
Leu Val Gly Ser Ala Gly Ile Pro Pro Phe Asp Pro Val Ala Gly Val 
405 410 415 

gct ggc ctg atc tcc ggc tgg ccc acc acg cct ctc ccg acg cgt cga 1296 
Ala Gly Leu Ile Ser Gly Trp Pro Thr Thr Pro Leu Pro Thr Arg Arg 
420 425 430 

gca tgg gtg gac ttc tgc ctg gtg gtc acg ctg aac acc cag aag ggg 1344 
Ala Trp Val Asp Phe Cys Leu Val Val Thr Leu Asn Thr Gln Lys Gly 
435 440 445 

cgc cat gcg tcg agc atg acc gtg gac gac cac gtc acc atc gag tgg 1392 
Arg His Ala Ser Ser Met Thr Val Asp Asp His Val Thr Ile Glu Trp 
450 455 460 

cga gac gtg gcc gag tag 1410 
Arg Asp Val Ala Glu 
465 

<210> 55 
<211> 469 
<212> PRT 

<213> Bacteriophage R4 
<400> 55 

Met Asn Arg Gly Gly Pro Thr Val Arg Ala Asp Ile Tyr Val Arg Ile
1 5 10 15

Ser Leu Asp Arg Thr Gly Glu Glu Leu Gly Val Glu Arg Gln Glu Glu
20 25 30

Ser Cys Arg Glu Leu Cys Lys Ser Leu Gly Met Glu Val Gly Gln Val
35 40 45

Trp Val Asp Asn Asp Leu Ser Ala Thr Lys Lys Asn Val Val Arg Pro
50 55 60

Asp Phe Glu Ala Met Ile Ala Ser Asn Pro Gln Ala Ile Val Cys Trp
65 70 75 80

His Thr Asp Arg Leu Ile Arg Val Thr Arg Asp Leu Glu Arg Val Ile
85 90 95

Asp Leu Gly Val Asn Val His Ala Val Met Ala Gly His Leu Asp Leu
100 105 110

Ser Thr Pro Ala Gly Arg Ala Val Ala Arg Thr Val Thr Ala Trp Ala 
115 120 125 

Thr Tyr Glu Gly Glu Gln Lys Ala Glu Arg Gln Lys Leu Ala Asn Ile
130 135 140

Gln Asn Ala Arg Ala Gly Lys Pro Tyr Thr Pro Gly Ile Arg Pro Phe
145 150 155 160

Gly Tyr Gly Asp Asp His Met Thr Ile Val Thr Ala Glu Ala Asp Ala
165 170 175

Ile Arg Asp Gly Ala Lys Met Ile Leu Asp Gly Trp Ser Leu Ser Ala
180 185 190

Val Ala Arg Tyr Trp Glu Glu Leu Lys Leu Gln Ser Pro Arg Ser Met
195 200 205 

Ala Ala Gly Gly Lys Gly Trp Ser Leu Arg Gly Val Lys Lys Val Leu 
210 215 220 

Thr Ser Pro Arg Tyr Val Gly Arg Ser Ser Tyr Leu Gly Glu Val Val 
225 230 235 240 

Gly Asp Ala Gln Trp Pro Pro Ile Leu Asp Pro Asp Val Tyr Tyr Gly
245 250 255

Val Val Ala Ile Leu Asn Asn Pro Asp Arg Phe Ser Gly Gly Pro Arg
260 265 270

Thr Gly Arg Thr Pro Gly Thr Leu Leu Ala Gly Ile Ala Leu Cys Gly
275 280 285

Glu Cys Gly Lys Thr Val Ser Gly Arg Gly Tyr Arg Gly Val Leu Val
290 295 300

Tyr Gly Cys Lys Asp Thr His Thr Arg Thr Pro Arg Ser Ile Ala Asp
305 310 315 320

Gly Arg Ala Ser Ser Ser Thr Leu Ala Arg Leu Met Phe Pro Asp Phe
325 330 335

Leu Pro Gly Leu Leu Ala Ser Gly Gln Ala Glu Asp Gly Gln Ser Ala
340 345 350

Ala Ser Lys His Ser Glu Ala Gln Thr Leu Arg Glu Arg Leu Asp Gly
355 360 365

Leu Ala Thr Ala Tyr Ala Glu Gly Ala Ile Ser Leu Ser Gln Met Thr
370 375 380

Ala Gly Ser Glu Ala Leu Arg Lys Lys Leu Glu Val Ile Glu Ala Asp
385 390 395 400

Leu Val Gly Ser Ala Gly Ile Pro Pro Phe Asp Pro Val Ala Gly Val
405 410 415

Ala Gly Leu Ile Ser Gly Trp Pro Thr Thr Pro Leu Pro Thr Arg Arg
420 425 430

Ala Trp Val Asp Phe Cys Leu Val Val Thr Leu Asn Thr Gln Lys Gly
435 440 445

Arg His Ala Ser Ser Met Thr Val Asp Asp His Val Thr Ile Glu Trp
450 455 460

Arg Asp Val Ala Glu
465



<210> 56
<211> 1503
<212> DNA

<213> CisA recombinase

<220>
<221> CDS
<222> (1)..(1500)

<400> 56

gtg ata gca ata tat gta agg gta tcg acc gag gaa caa gcg atc aag 48
Val Ile Ala Ile Tyr Val Arg Val Ser Thr Glu Glu Gln Ala Ile Lys
1 5 10 15

gga tcg agc atc gac agc caa atc gag gcc tgt ata aag aaa gca ggg 96
Gly Ser Ser Ile Asp Ser Gln Ile Glu Ala Cys Ile Lys Lys Ala Gly
20 25 30

act aaa gat gtg ctg aag tat gca gat gaa gga ttt tca gga gag ctt 144
Thr Lys Asp Val Leu Lys Tyr Ala Asp Glu Gly Phe Ser Gly Glu Leu
35 40 45

tta gaa cgt ccg gct ttg aat cgc ttg agg gag gat gca agc aag gga 192
Leu Glu Arg Pro Ala Leu Asn Arg Leu Arg Glu Asp Ala Ser Lys Gly
50 55 60

ctt ata agt caa gtc att tgt tac gat cct gac cgt ctt tct cgg aaa 240
Leu Ile Ser Gln Val Ile Cys Tyr Asp Pro Asp Arg Leu Ser Arg Lys
65 70 75 80

tta atg aat cag cta atc att gat gac gaa ttg cga aag cga aac ata 288
Leu Met Asn Gln Leu Ile Ile Asp Asp Glu Leu Arg Lys Arg Asn Ile
85 90 95

cct ttg att ttt gta aat ggt gaa tac gcc aat tct cca gaa ggt caa 336
Pro Leu Ile Phe Val Asn Gly Glu Tyr Ala Asn Ser Pro Glu Gly Gln
100 105 110

ttg ttt ttc gca atg cgc ggg gca atc tca gaa ttt gaa aaa gcc aaa 384
Leu Phe Phe Ala Met Arg Gly Ala Ile Ser Glu Phe Glu Lys Ala Lys
115 120 125

atc aaa gaa cgg aca tca agc ggc cga ctt caa aaa atg aaa aaa ggc 432
Ile Lys Glu Arg Thr Ser Ser Gly Arg Leu Gln Lys Met Lys Lys Gly
130 135 140

atg atc att aaa gat tct aaa cta tat ggc tat aaa ttt gtt aaa gag 480
Met Ile Ile Lys Asp Ser Lys Leu Tyr Gly Tyr Lys Phe Val Lys Glu
145 150 155 160



aaa aga act ctt gag ata tta gaa gag gaa gca aaa atc att cgg atg 528
Lys Arg Thr Leu Glu Ile Leu Glu Glu Glu Ala Lys Ile Ile Arg Met
165 170 175

att ttt aac tat ttc acc gat cat aaa agc cct ttt ttc ggc aga gta 576
Ile Phe Asn Tyr Phe Thr Asp His Lys Ser Pro Phe Phe Gly Arg Val
180 185 190

aat ggt att gct cta cat tta act cag atg ggg gtt aaa aca aaa aaa 624
Asn Gly Ile Ala Leu His Leu Thr Gln Met Gly Val Lys Thr Lys Lys
195 200 205

ggc gcc aaa gta tgg cac agg cag gtt gtt cgg caa ata tta atg aac 672
Gly Ala Lys Val Trp His Arg Gln Val Val Arg Gln Ile Leu Met Asn
210 215 220

tct tcc tat aag ggt gaa cat aga cag tat aaa tat gat aca gag ggt 720
Ser Ser Tyr Lys Gly Glu His Arg Gln Tyr Lys Tyr Asp Thr Glu Gly
225 230 235 240

tcc tat gtt tca aag cag gca ggg aac aaa tct ata att aaa ata agg 768
Ser Tyr Val Ser Lys Gln Ala Gly Asn Lys Ser Ile Ile Lys Ile Arg
245 250 255

cct gaa gaa gaa caa atc act gtg aca att cca gca att gtt cca gct 816
Pro Glu Glu Glu Gln Ile Thr Val Thr Ile Pro Ala Ile Val Pro Ala
260 265 270

gaa caa tgg gat tat gct caa gaa ctc tta ggt caa agt aaa aga aaa 864
Glu Gln Trp Asp Tyr Ala Gln Glu Leu Leu Gly Gln Ser Lys Arg Lys
275 280 285

cac ttg agt atc agc cct cac aat tac ttg tta tcg ggt ttg gtt aga 912
His Leu Ser Ile Ser Pro His Asn Tyr Leu Leu Ser Gly Leu Val Arg
290 295 300

tgc gga aaa tgc gga aat acc atg aca ggg aag aaa aga aaa tca cat 960
Cys Gly Lys Cys Gly Asn Thr Met Thr Gly Lys Lys Arg Lys Ser His
305 310 315 320

ggt aaa gac tac tat gta tat act tgc cgg aaa aat tat tct ggc gca 1008
Gly Lys Asp Tyr Tyr Val Tyr Thr Cys Arg Lys Asn Tyr Ser Gly Ala
325 330 335

aag gac cgc ggc tgc gga aaa gaa atg tct gag aat aaa ttg aac cgg 1056
Lys Asp Arg Gly Cys Gly Lys Glu Met Ser Glu Asn Lys Leu Asn Arg
340 345 350

cat gta tgg ggt gaa att ttt aaa ttc atc aca aat cct caa aag tat 1104
His Val Trp Gly Glu Ile Phe Lys Phe Ile Thr Asn Pro Gln Lys Tyr
355 360 365

gtt tct ttt aaa gag gct gaa caa tca aat cac ctg tct gat gaa tta 1152
Val Ser Phe Lys Glu Ala Glu Gln Ser Asn His Leu Ser Asp Glu Leu
370 375 380

gaa ctt att gaa aaa gag ata gag aaa aca aaa aaa ggc cgc aag cgt 1200
Glu Leu Ile Glu Lys Glu Ile Glu Lys Thr Lys Lys Gly Arg Lys Arg
385 390 395 400

ctt tta acg cta atc agc cta agc gat gac gat gat tta gac ata gat 1248
Leu Leu Thr Leu Ile Ser Leu Ser Asp Asp Asp Asp Leu Asp Ile Asp
405 410 415

gaa atc aaa gca caa att att gaa ctg caa aaa aag caa aat cag ctt 1296
Glu Ile Lys Ala Gln Ile Ile Glu Leu Gln Lys Lys Gln Asn Gln Leu
420 425 430

act gaa aag tgt aac aga atc cag tca aaa atg aaa gtc cta gat gat 1344
Thr Glu Lys Cys Asn Arg Ile Gln Ser Lys Met Lys Val Leu Asp Asp
435 440 445

acg agc tca agt gaa aat gct cta aaa aga gcc atc gac tat ttt caa 1392
Thr Ser Ser Ser Glu Asn Ala Leu Lys Arg Ala Ile Asp Tyr Phe Gln
450 455 460

tca atc ggt gca gat aac tta act ctt gaa gat aaa aaa aca att gtt 1440
Ser Ile Gly Ala Asp Asn Leu Thr Leu Glu Asp Lys Lys Thr Ile Val
465 470 475 480

aac ttt atc gtg aaa gaa gtt acc att gtg gat tct gac acc ata tat 1488
Asn Phe Ile Val Lys Glu Val Thr Ile Val Asp Ser Asp Thr Ile Tyr
485 490 495

att gaa acg tat taa 1503
Ile Glu Thr Tyr
500



<210> 57
<211> 500
<212> PRT

<213> CisA recombinase

<400> 57

Val Ile Ala Ile Tyr Val Arg Val Ser Thr Glu Glu Gln Ala Ile Lys
1 5 10 15

Gly Ser Ser Ile Asp Ser Gln Ile Glu Ala Cys Ile Lys Lys Ala Gly
20 25 30

Thr Lys Asp Val Leu Lys Tyr Ala Asp Glu Gly Phe Ser Gly Glu Leu
35 40 45



Leu Glu Arg Pro Ala Leu Asn Arg Leu Arg Glu Asp Ala Ser Lys Gly
50 55 60

Leu Ile Ser Gln Val Ile Cys Tyr Asp Pro Asp Arg Leu Ser Arg Lys
65 70 75 80

Leu Met Asn Gln Leu Ile Ile Asp Asp Glu Leu Arg Lys Arg Asn Ile
85 90 95

Pro Leu Ile Phe Val Asn Gly Glu Tyr Ala Asn Ser Pro Glu Gly Gln
100 105 110

Leu Phe Phe Ala Met Arg Gly Ala Ile Ser Glu Phe Glu Lys Ala Lys
115 120 125

Ile Lys Glu Arg Thr Ser Ser Gly Arg Leu Gln Lys Met Lys Lys Gly
130 135 140

Met Ile Ile Lys Asp Ser Lys Leu Tyr Gly Tyr Lys Phe Val Lys Glu
145 150 155 160

Lys Arg Thr Leu Glu Ile Leu Glu Glu Glu Ala Lys Ile Ile Arg Met
165 170 175

Ile Phe Asn Tyr Phe Thr Asp His Lys Ser Pro Phe Phe Gly Arg Val
180 185 190

Asn Gly Ile Ala Leu His Leu Thr Gln Met Gly Val Lys Thr Lys Lys
195 200 205

Gly Ala Lys Val Trp His Arg Gln Val Val Arg Gln Ile Leu Met Asn
210 215 220

Ser Ser Tyr Lys Gly Glu His Arg Gln Tyr Lys Tyr Asp Thr Glu Gly
225 230 235 240



Ser Tyr Val Ser Lys Gln Ala Gly Asn Lys Ser Ile Ile Lys Ile Arg
245 250 255

Pro Glu Glu Glu Gln Ile Thr Val Thr Ile Pro Ala Ile Val Pro Ala
260 265 270

Glu Gln Trp Asp Tyr Ala Gln Glu Leu Leu Gly Gln Ser Lys Arg Lys
275 280 285

His Leu Ser Ile Ser Pro His Asn Tyr Leu Leu Ser Gly Leu Val Arg
290 295 300

Cys Gly Lys Cys Gly Asn Thr Met Thr Gly Lys Lys Arg Lys Ser His
305 310 315 320

Gly Lys Asp Tyr Tyr Val Tyr Thr Cys Arg Lys Asn Tyr Ser Gly Ala
325 330 335

Lys Asp Arg Gly Cys Gly Lys Glu Met Ser Glu Asn Lys Leu Asn Arg
340 345 350

His Val Trp Gly Glu Ile Phe Lys Phe Ile Thr Asn Pro Gln Lys Tyr
355 360 365

Val Ser Phe Lys Glu Ala Glu Gln Ser Asn His Leu Ser Asp Glu Leu
370 375 380

Glu Leu Ile Glu Lys Glu Ile Glu Lys Thr Lys Lys Gly Arg Lys Arg
385 390 395 400

Leu Leu Thr Leu Ile Ser Leu Ser Asp Asp Asp Asp Leu Asp Ile Asp
405 410 415

Glu Ile Lys Ala Gln Ile Ile Glu Leu Gln Lys Lys Gln Asn Gln Leu
420 425 430

Thr Glu Lys Cys Asn Arg Ile Gln Ser Lys Met Lys Val Leu Asp Asp
435 440 445

Thr Ser Ser Ser Glu Asn Ala Leu Lys Arg Ala Ile Asp Tyr Phe Gln
450 455 460

Ser Ile Gly Ala Asp Asn Leu Thr Leu Glu Asp Lys Lys Thr Ile Val
465 470 475 480

Asn Phe Ile Val Lys Glu Val Thr Ile Val Asp Ser Asp Thr Ile Tyr
485 490 495

Ile Glu Thr Tyr
500



<210> 58
<211> 1545
<212> DNA

<213> XisF recombinase

<220>
<221> CDS
<222> (1)..(1542)

<400> 58

atg gaa aat tgg ggt tac gcg aga gtg agc ggt gag gaa cag caa aca 48
Met Glu Asn Trp Gly Tyr Ala Arg Val Ser Gly Glu Glu Gln Gln Thr
1 5 10 15

gat aaa ggt gcg ttg cgt aaa caa ata gaa cgc ttg cgt aat gct gga 96
Asp Lys Gly Ala Leu Arg Lys Gln Ile Glu Arg Leu Arg Asn Ala Gly
20 25 30

tgt tca aaa gtg tac tgg gat att caa tcg cgg aca act gaa gtc aga 144
Cys Ser Lys Val Tyr Trp Asp Ile Gln Ser Arg Thr Thr Glu Val Arg
35 40 45

gaa ggg cta caa caa tta att aat gac tta aag aca tct tca aca ggt 192
Glu Gly Leu Gln Gln Leu Ile Asn Asp Leu Lys Thr Ser Ser Thr Gly
50 55 60

aag gta aaa tca ctg caa ttt acc cgc att gat cgc atc ggc tca tca 240
Lys Val Lys Ser Leu Gln Phe Thr Arg Ile Asp Arg Ile Gly Ser Ser
65 70 75 80

tcg cgg ttg ttt tat tca ttg tta gag gta tta cgt tcc aag gga att 288
Ser Arg Leu Phe Tyr Ser Leu Leu Glu Val Leu Arg Ser Lys Gly Ile
85 90 95

aaa ctg ata gcc tta gat caa ggc gtt gac cca gac agc ctt ggc ggg 336
Lys Leu Ile Ala Leu Asp Gln Gly Val Asp Pro Asp Ser Leu Gly Gly
100 105 110

gaa cta aca att gat atg tta ctg gct gct gcc aaa ttt gag gta aga 384
Glu Leu Thr Ile Asp Met Leu Leu Ala Ala Ala Lys Phe Glu Val Arg
115 120 125

atg gtg acg gag agg tta aaa agc gaa cgt cgt cat agg gtg aac caa 432
Met Val Thr Glu Arg Leu Lys Ser Glu Arg Arg His Arg Val Asn Gln
130 135 140

gga aaa agt cac cga gtt gcc cca tta gga tac cgc aaa gat aaa gat 480
Gly Lys Ser His Arg Val Ala Pro Leu Gly Tyr Arg Lys Asp Lys Asp
145 150 155 160

aaa tat ata cgc gat cgc tca cca tgt gtt tgc tta cta gaa gga cgc 528
Lys Tyr Ile Arg Asp Arg Ser Pro Cys Val Cys Leu Leu Glu Gly Arg
165 170 175

aga gaa tta acg gtg tct gac tta gcc cag tat att ttt cac act ttt 576
Arg Glu Leu Thr Val Ser Asp Leu Ala Gln Tyr Ile Phe His Thr Phe
180 185 190

ttt gag tgc ggt tcc gtt gct gct act gtg cgt aag ctg cac tca gat 624
Phe Glu Cys Gly Ser Val Ala Ala Thr Val Arg Lys Leu His Ser Asp
195 200 205

ttt ggt ata gaa aca aaa gtt ctg aat tgg aac aag cta gaa aaa tct 672
Phe Gly Ile Glu Thr Lys Val Leu Asn Trp Asn Lys Leu Glu Lys Ser
210 215 220

tcc cgg att gtt ggc gac gac gac tta gat aaa att gca ttt aca cca 720
Ser Arg Ile Val Gly Asp Asp Asp Leu Asp Lys Ile Ala Phe Thr Pro
225 230 235 240

aat aaa act aac cac ccc ttg cgt tat ccc tgg tct ggg cta aga tgg 768
Asn Lys Thr Asn His Pro Leu Arg Tyr Pro Trp Ser Gly Leu Arg Trp
245 250 255

tca atc cct ggt tta aaa gcg tta tta gtt aac cct gtt tac gcc ggg 816
Ser Ile Pro Gly Leu Lys Ala Leu Leu Val Asn Pro Val Tyr Ala Gly
260 265 270

ggt ttg ccc ttt gat act tac gtt aaa tca aaa gga aaa cgc aag cat 864
Gly Leu Pro Phe Asp Thr Tyr Val Lys Ser Lys Gly Lys Arg Lys His
275 280 285

ttt gac gag tgg aaa gta aaa tgg gga acc cac gac gat gag gca atc 912
Phe Asp Glu Trp Lys Val Lys Trp Gly Thr His Asp Asp Glu Ala Ile
290 295 300



att acc tgt gag gaa cat gaa aga ata aaa cag atg att cga gac aat 960
Ile Thr Cys Glu Glu His Glu Arg Ile Lys Gln Met Ile Arg Asp Asn
305 310 315 320

cgc aat aat cga tgg gct gca aga gaa gaa aac gaa gta aac cca ttt 1008
Arg Asn Asn Arg Trp Ala Ala Arg Glu Glu Asn Glu Val Asn Pro Phe
325 330 335

tct aat tta ctt aaa tgt acc cat tgc ggc ggc tca atg aca cgc cac 1056
Ser Asn Leu Leu Lys Cys Thr His Cys Gly Gly Ser Met Thr Arg His
340 345 350

gcc aaa cgt gta gat aag agt gga caa gct atc tat tat tat cag tgc 1104
Ala Lys Arg Val Asp Lys Ser Gly Gln Ala Ile Tyr Tyr Tyr Gln Cys
355 360 365

cga ttg tat aaa gct ggc aac tgt agc aat aaa aat atg att tca tcc 1152
Arg Leu Tyr Lys Ala Gly Asn Cys Ser Asn Lys Asn Met Ile Ser Ser
370 375 380

aaa ata tta gat atc caa gta atg gat tta ttg gca caa gaa gcc gaa 1200
Lys Ile Leu Asp Ile Gln Val Met Asp Leu Leu Ala Gln Glu Ala Glu
385 390 395 400



cgt tta gca aat ttg gtg gaa aca gat gag ccg ctt att gta gaa gaa 1248
Arg Leu Ala Asn Leu Val Glu Thr Asp Glu Pro Leu Ile Val Glu Glu
405 410 415

ccc cca gaa gta aaa acg ctg cgc gca tcc ctg aat agt ctg gaa aca 1296
Pro Pro Glu Val Lys Thr Leu Arg Ala Ser Leu Asn Ser Leu Glu Thr
420 425 430

ttg cca gca agt tca gca att gaa caa att aaa aat gac ctc aaa gaa 1344
Leu Pro Ala Ser Ser Ala Ile Glu Gln Ile Lys Asn Asp Leu Lys Glu
435 440 445

cag att gcg atc gca cta gga gca acc aat aat gct tct aaa caa tct 1392
Gln Ile Ala Ile Ala Leu Gly Ala Thr Asn Asn Ala Ser Lys Gln Ser
450 455 460

ctg att gcc aag gaa aga att ata caa gct ttt gct cat aaa agt tac 1440
Leu Ile Ala Lys Glu Arg Ile Ile Gln Ala Phe Ala His Lys Ser Tyr
465 470 475 480

tgg caa gga cta aac gct caa gat aaa cga gca atc ctc aat ggt tgc 1488
Trp Gln Gly Leu Asn Ala Gln Asp Lys Arg Ala Ile Leu Asn Gly Cys
485 490 495

gta aaa aaa atc tcc gta gat ggt aac ttt gtt aca gct att gag tat 1536
Val Lys Lys Ile Ser Val Asp Gly Asn Phe Val Thr Ala Ile Glu Tyr
500 505 510

cgt tac tag 1545
Arg Tyr

<210> 59
<211> 514
<212> PRT

<213> XisF recombinase
<400> 59

Met Glu Asn Trp Gly Tyr Ala Arg Val Ser Gly Glu Glu Gln Gln Thr
1 5 10 15

Asp Lys Gly Ala Leu Arg Lys Gln Ile Glu Arg Leu Arg Asn Ala Gly
20 25 30

Cys Ser Lys Val Tyr Trp Asp Ile Gln Ser Arg Thr Thr Glu Val Arg
35 40 45

Glu Gly Leu Gln Gln Leu Ile Asn Asp Leu Lys Thr Ser Ser Thr Gly
50 55 60

Lys Val Lys Ser Leu Gln Phe Thr Arg Ile Asp Arg Ile Gly Ser Ser
65 70 75 80

Ser Arg Leu Phe Tyr Ser Leu Leu Glu Val Leu Arg Ser Lys Gly Ile
85 90 95

Lys Leu Ile Ala Leu Asp Gln Gly Val Asp Pro Asp Ser Leu Gly Gly
100 105 110

Glu Leu Thr Ile Asp Met Leu Leu Ala Ala Ala Lys Phe Glu Val Arg
115 120 125

Met Val Thr Glu Arg Leu Lys Ser Glu Arg Arg His Arg Val Asn Gln
130 135 140

Gly Lys Ser His Arg Val Ala Pro Leu Gly Tyr Arg Lys Asp Lys Asp
145 150 155 160

Lys Tyr Ile Arg Asp Arg Ser Pro Cys Val Cys Leu Leu Glu Gly Arg
165 170 175

Arg Glu Leu Thr Val Ser Asp Leu Ala Gln Tyr Ile Phe His Thr Phe
180 185 190

Phe Glu Cys Gly Ser Val Ala Ala Thr Val Arg Lys Leu His Ser Asp
195 200 205

Phe Gly Ile Glu Thr Lys Val Leu Asn Trp Asn Lys Leu Glu Lys Ser
210 215 220

Ser Arg Ile Val Gly Asp Asp Asp Leu Asp Lys Ile Ala Phe Thr Pro
225 230 235 240

Asn Lys Thr Asn His Pro Leu Arg Tyr Pro Trp Ser Gly Leu Arg Trp 
245 250 255 

Ser Ile Pro Gly Leu Lys Ala Leu Leu Val Asn Pro Val Tyr Ala Gly
260 265 270

Gly Leu Pro Phe Asp Thr Tyr Val Lys Ser Lys Gly Lys Arg Lys His
275 280 285

Phe Asp Glu Trp Lys Val Lys Trp Gly Thr His Asp Asp Glu Ala Ile
290 295 300

Ile Thr Cys Glu Glu His Glu Arg Ile Lys Gln Met Ile Arg Asp Asn
305 310 315 320

Arg Asn Asn Arg Trp Ala Ala Arg Glu Glu Asn Glu Val Asn Pro Phe
325 330 335

Ser Asn Leu Leu Lys Cys Thr His Cys Gly Gly Ser Met Thr Arg His
340 345 350

Ala Lys Arg Val Asp Lys Ser Gly Gln Ala Ile Tyr Tyr Tyr Gln Cys
355 360 365

Arg Leu Tyr Lys Ala Gly Asn Cys Ser Asn Lys Asn Met Ile Ser Ser
370 375 380

Lys Ile Leu Asp Ile Gln Val Met Asp Leu Leu Ala Gln Glu Ala Glu
385 390 395 400

Arg Leu Ala Asn Leu Val Glu Thr Asp Glu Pro Leu Ile Val Glu Glu
405 410 415



Pro Pro Glu Val Lys Thr Leu Arg Ala Ser Leu Asn Ser Leu Glu Thr
420 425 430

Leu Pro Ala Ser Ser Ala Ile Glu Gln Ile Lys Asn Asp Leu Lys Glu
435 440 445

Gln Ile Ala Ile Ala Leu Gly Ala Thr Asn Asn Ala Ser Lys Gln Ser
450 455 460

Leu Ile Ala Lys Glu Arg Ile Ile Gln Ala Phe Ala His Lys Ser Tyr
465 470 475 480

Trp Gln Gly Leu Asn Ala Gln Asp Lys Arg Ala Ile Leu Asn Gly Cys
485 490 495

Val Lys Lys Ile Ser Val Asp Gly Asn Phe Val Thr Ala Ile Glu Tyr
500 505 510

Arg Tyr



<210> 60
<211> 2124
<212> DNA

<213> Transposon Tn4451

<220>
<221> CDS
<222> (1)..(2121)

<400> 60

atg tca agg act tca aga att aca gca ctt tac gag cgt ttg tca aga 48
Met Ser Arg Thr Ser Arg Ile Thr Ala Leu Tyr Glu Arg Leu Ser Arg
1 5 10 15

gat gat gac ctt act ggc gag agt aat tct att acc aat caa aag aaa 96
Asp Asp Asp Leu Thr Gly Glu Ser Asn Ser Ile Thr Asn Gln Lys Lys
20 25 30

tac ctc gaa gat tat gcc cgt agg aat ggt ttt gag aac att cgc cat 144
Tyr Leu Glu Asp Tyr Ala Arg Arg Asn Gly Phe Glu Asn Ile Arg His
35 40 45

ttt acc gat gac gga ttt tcg ggt gta aat ttc aat cgc cct ggc ttt 192
Phe Thr Asp Asp Gly Phe Ser Gly Val Asn Phe Asn Arg Pro Gly Phe
50 55 60

caa tct ctg ata aaa gaa gtt gaa gca gga aat gta gaa acc ttg att 240
Gln Ser Leu Ile Lys Glu Val Glu Ala Gly Asn Val Glu Thr Leu Ile
65 70 75 80

gtt aag gat atg agc cga ttg ggg cga aat tat ctg caa gta ggt ttt 288
Val Lys Asp Met Ser Arg Leu Gly Arg Asn Tyr Leu Gln Val Gly Phe
85 90 95

tat acg gaa gtt ctg ttt cca cag aaa aat gtc cgt ttc ctt gca att 336
Tyr Thr Glu Val Leu Phe Pro Gln Lys Asn Val Arg Phe Leu Ala Ile
100 105 110

aac aac agt att gac agt aac aac gct tcg gat aat gac ttt gct ccg 384
Asn Asn Ser Ile Asp Ser Asn Asn Ala Ser Asp Asn Asp Phe Ala Pro
115 120 125

ttt ttg aat att atg aac gaa tgg tat gcc aaa gac aca agc aac aaa 432
Phe Leu Asn Ile Met Asn Glu Trp Tyr Ala Lys Asp Thr Ser Asn Lys
130 135 140

atc aag gct ata ttc gat gcc cgt atg aaa gac gga aag cgt tgt agc 480
Ile Lys Ala Ile Phe Asp Ala Arg Met Lys Asp Gly Lys Arg Cys Ser
145 150 155 160

ggt tct atc cct tat ggg tat aac cga ctg ccg agc gac aaa caa acg 528
Gly Ser Ile Pro Tyr Gly Tyr Asn Arg Leu Pro Ser Asp Lys Gln Thr
165 170 175

ctt gtg gtt gac cct gtg gct tcg gaa gtg gta aag cgt atc ttt act 576
Leu Val Val Asp Pro Val Ala Ser Glu Val Val Lys Arg Ile Phe Thr
180 185 190



ctt gcc aat gat ggc aaa agt aca agg gca atc gca gaa ata ctg acc 624
Leu Ala Asn Asp Gly Lys Ser Thr Arg Ala Ile Ala Glu Ile Leu Thr
195 200 205

gaa gaa aaa gtt tta acc cct gcg gca tac gca aag gaa tac cac ccc 672
Glu Glu Lys Val Leu Thr Pro Ala Ala Tyr Ala Lys Glu Tyr His Pro
210 215 220

gaa cag tac aac ggc aac aag ttc aca aac cct tat ctt tgg gca atg 720
Glu Gln Tyr Asn Gly Asn Lys Phe Thr Asn Pro Tyr Leu Trp Ala Met
225 230 235 240

tca acg ata aga aat att tta ggc agg cag gaa tat ctc ggt cac acc 768
Ser Thr Ile Arg Asn Ile Leu Gly Arg Gln Glu Tyr Leu Gly His Thr
245 250 255

gtt ttg cga aag tcg gta agc aca aat ttc aaa ctt cac aag aga aaa 816
Val Leu Arg Lys Ser Val Ser Thr Asn Phe Lys Leu His Lys Arg Lys
260 265 270



agc aca gac gaa gaa gaa cag tat gta ttt ccg aat aca cac gag cct 864
Ser Thr Asp Glu Glu Glu Gln Tyr Val Phe Pro Asn Thr His Glu Pro
275 280 285

atc ata tcg cag gaa ctt tgg gac agc gtt caa aaa cgc aga agc aga 912
Ile Ile Ser Gln Glu Leu Trp Asp Ser Val Gln Lys Arg Arg Ser Arg
290 295 300

gta aat cgt gcc tcg gct tgg gga acg cac agc aac cgt tta agc gga 960
Val Asn Arg Ala Ser Ala Trp Gly Thr His Ser Asn Arg Leu Ser Gly
305 310 315 320

tat ttg tac tgt gcc gat tgc gga aga aga atg act ttg cag aca cat 1008
Tyr Leu Tyr Cys Ala Asp Cys Gly Arg Arg Met Thr Leu Gln Thr His
325 330 335

tac agc aaa aaa gac ggt tct gtg cag tat tct tac cgt tgc ggt ggg 1056
Tyr Ser Lys Lys Asp Gly Ser Val Gln Tyr Ser Tyr Arg Cys Gly Gly
340 345 350

tat gca agc aga gtg aac agt tgt acc agt cat tcg att agt acc gat 1104
Tyr Ala Ser Arg Val Asn Ser Cys Thr Ser His Ser Ile Ser Thr Asp
355 360 365

aat gtt gaa gcc ttg ata tta tca tct gtc aaa cgc ttt tca agg ttt 1152
Asn Val Glu Ala Leu Ile Leu Ser Ser Val Lys Arg Phe Ser Arg Phe
370 375 380

gtt ctg aat gat gaa caa gca ttt gct ttg gaa ctg caa tct ctt tgg 1200
Val Leu Asn Asp Glu Gln Ala Phe Ala Leu Glu Leu Gln Ser Leu Trp
385 390 395 400



aat gaa aaa cag gag gaa aag ccg aaa cac aat caa tcg gaa ctg caa 1248
Asn Glu Lys Gln Glu Glu Lys Pro Lys His Asn Gln Ser Glu Leu Gln
405 410 415

cgc tgt cag aaa cgc tat gac gaa ctc tct acc ctt gtt cgt ggc ttg 1296
Arg Cys Gln Lys Arg Tyr Asp Glu Leu Ser Thr Leu Val Arg Gly Leu
420 425 430

tat gaa aat ctt atg tcg gga tta ctg ccc gaa aga cag tat aag caa 1344
Tyr Glu Asn Leu Met Ser Gly Leu Leu Pro Glu Arg Gln Tyr Lys Gln
435 440 445

ctg atg aaa cag tat gat gac gag cag gca gag ttg gaa acg aaa atg 1392
Leu Met Lys Gln Tyr Asp Asp Glu Gln Ala Glu Leu Glu Thr Lys Met
450 455 460

gaa acg atg aaa aca gaa ctt gcc gaa gaa aaa gta agt tcc gtt gat 1440
Glu Thr Met Lys Thr Glu Leu Ala Glu Glu Lys Val Ser Ser Val Asp
465 470 475 480

att aag cat ttc att tcg ctg ata cgc aag tgt aaa aat cct acg gaa 1488
Ile Lys His Phe Ile Ser Leu Ile Arg Lys Cys Lys Asn Pro Thr Glu
485 490 495

atc tcc gat aca atg ttt aat gaa ctt gtt gat aag ata gtg gtt tat 1536
Ile Ser Asp Thr Met Phe Asn Glu Leu Val Asp Lys Ile Val Val Tyr
500 505 510

gaa gca gag ggt gtg gga aaa gca cga aca caa aag gtc gat att tat 1584
Glu Ala Glu Gly Val Gly Lys Ala Arg Thr Gln Lys Val Asp Ile Tyr
515 520 525

ttt aac tat gtc ggt caa gtg gat att gcc tat acc gaa gaa gaa ctt 1632
Phe Asn Tyr Val Gly Gln Val Asp Ile Ala Tyr Thr Glu Glu Glu Leu
530 535 540

gcc gag ata gaa aca cag aaa gag cag gag gaa cag caa cgc ttg gca 1680
Ala Glu Ile Glu Thr Gln Lys Glu Gln Glu Glu Gln Gln Arg Leu Ala
545 550 555 560

aga cag cgc aag cgt gaa aaa gcc tac cga gaa aag cga aag gca cag 1728
Arg Gln Arg Lys Arg Glu Lys Ala Tyr Arg Glu Lys Arg Lys Ala Gln
565 570 575

aaa atc gct gaa aac ggt ggc gaa atc gtt aag aca aag gtt tgc cct 1776
Lys Ile Ala Glu Asn Gly Gly Glu Ile Val Lys Thr Lys Val Cys Pro
580 585 590

cat tgc aac aaa gag ttt atc ccg aca agc aac cga cag gtg ttc tgt 1824
His Cys Asn Lys Glu Phe Ile Pro Thr Ser Asn Arg Gln Val Phe Cys
595 600 605

tcc aaa gag tgc tgc tat caa gca agg caa gac aaa aag aaa aca gac 1872
Ser Lys Glu Cys Cys Tyr Gln Ala Arg Gln Asp Lys Lys Lys Thr Asp
610 615 620

cga gaa gca gaa cga gga aat cac tat tac cga cag cgt gta tgt gct 1920
Arg Glu Ala Glu Arg Gly Asn His Tyr Tyr Arg Gln Arg Val Cys Ala
625 630 635 640

gtn tgc ggc aat tcc tat tgg cct aca cac agc caa cag aaa ttc tgc 1968
Val Cys Gly Asn Ser Tyr Trp Pro Thr His Ser Gln Gln Lys Phe Cys
645 650 655

tcc gaa gaa tgt caa agg gta aat cac aat aag aaa aca ttg gaa ttt 2016
Ser Glu Glu Cys Gln Arg Val Asn His Asn Lys Lys Thr Leu Glu Phe
660 665 670

tac cac cat aaa aaa gaa aag gag aag ctg caa tgc aaa gat tta tca 2064
Tyr His His Lys Lys Glu Lys Glu Lys Leu Gln Cys Lys Asp Leu Ser
675 680 685

cag acg aaa gaa cgg gta tcc gat atg aac tta tcg ggg act att act 2112
Gln Thr Lys Glu Arg Val Ser Asp Met Asn Leu Ser Gly Thr Ile Thr
690 695 700

acc cct gct taa 2124
Thr Pro Ala
705

<210> 61 
<211> 707 
<212> PRT 

<213> Transposon Tn4451
<400> 61

Met Ser Arg Thr Ser Arg Ile Thr Ala Leu Tyr Glu Arg Leu Ser Arg
1 5 10 15

Asp Asp Asp Leu Thr Gly Glu Ser Asn Ser Ile Thr Asn Gln Lys Lys
20 25 30

Tyr Leu Glu Asp Tyr Ala Arg Arg Asn Gly Phe Glu Asn Ile Arg His
35 40 45

Phe Thr Asp Asp Gly Phe Ser Gly Val Asn Phe Asn Arg Pro Gly Phe
50 55 60

Gln Ser Leu Ile Lys Glu Val Glu Ala Gly Asn Val Glu Thr Leu Ile
65 70 75 80

Val Lys Asp Met Ser Arg Leu Gly Arg Asn Tyr Leu Gln Val Gly Phe
85 90 95

Tyr Thr Glu Val Leu Phe Pro Gln Lys Asn Val Arg Phe Leu Ala Ile
100 105 110

Asn Asn Ser Ile Asp Ser Asn Asn Ala Ser Asp Asn Asp Phe Ala Pro
115 120 125

Phe Leu Asn Ile Met Asn Glu Trp Tyr Ala Lys Asp Thr Ser Asn Lys
130 135 140

Ile Lys Ala Ile Phe Asp Ala Arg Met Lys Asp Gly Lys Arg Cys Ser
145 150 155 160

Gly Ser Ile Pro Tyr Gly Tyr Asn Arg Leu Pro Ser Asp Lys Gln Thr
165 170 175

Leu Val Val Asp Pro Val Ala Ser Glu Val Val Lys Arg Ile Phe Thr
180 185 190



Leu Ala Asn Asp Gly Lys Ser Thr Arg Ala Ile Ala Glu Ile Leu Thr
195 200 205

Glu Glu Lys Val Leu Thr Pro Ala Ala Tyr Ala Lys Glu Tyr His Pro
210 215 220

Glu Gln Tyr Asn Gly Asn Lys Phe Thr Asn Pro Tyr Leu Trp Ala Met
225 230 235 240

Ser Thr Ile Arg Asn Ile Leu Gly Arg Gln Glu Tyr Leu Gly His Thr
245 250 255

Val Leu Arg Lys Ser Val Ser Thr Asn Phe Lys Leu His Lys Arg Lys
260 265 270

Ser Thr Asp Glu Glu Glu Gln Tyr Val Phe Pro Asn Thr His Glu Pro
275 280 285

Ile Ile Ser Gln Glu Leu Trp Asp Ser Val Gln Lys Arg Arg Ser Arg
290 295 300

Val Asn Arg Ala Ser Ala Trp Gly Thr His Ser Asn Arg Leu Ser Gly
305 310 315 320

Tyr Leu Tyr Cys Ala Asp Cys Gly Arg Arg Met Thr Leu Gln Thr His
325 330 335

Tyr Ser Lys Lys Asp Gly Ser Val Gln Tyr Ser Tyr Arg Cys Gly Gly
340 345 350

Tyr Ala Ser Arg Val Asn Ser Cys Thr Ser His Ser Ile Ser Thr Asp
355 360 365

Asn Val Glu Ala Leu Ile Leu Ser Ser Val Lys Arg Phe Ser Arg Phe
370 375 380

Val Leu Asn Asp Glu Gln Ala Phe Ala Leu Glu Leu Gln Ser Leu Trp
385 390 395 400

Asn Glu Lys Gln Glu Glu Lys Pro Lys His Asn Gln Ser Glu Leu Gln
405 410 415

Arg Cys Gln Lys Arg Tyr Asp Glu Leu Ser Thr Leu Val Arg Gly Leu
420 425 430

Tyr Glu Asn Leu Met Ser Gly Leu Leu Pro Glu Arg Gln Tyr Lys Gln
435 440 445

Leu Met Lys Gln Tyr Asp Asp Glu Gln Ala Glu Leu Glu Thr Lys Met
450 455 460

Glu Thr Met Lys Thr Glu Leu Ala Glu Glu Lys Val Ser Ser Val Asp
465 470 475 480

Ile Lys His Phe Ile Ser Leu Ile Arg Lys Cys Lys Asn Pro Thr Glu
485 490 495

Ile Ser Asp Thr Met Phe Asn Glu Leu Val Asp Lys Ile Val Val Tyr
500 505 510

Glu Ala Glu Gly Val Gly Lys Ala Arg Thr Gln Lys Val Asp Ile Tyr
515 520 525

Phe Asn Tyr Val Gly Gln Val Asp Ile Ala Tyr Thr Glu Glu Glu Leu
530 535 540

Ala Glu Ile Glu Thr Gln Lys Glu Gln Glu Glu Gln Gln Arg Leu Ala
545 550 555 560

Arg Gln Arg Lys Arg Glu Lys Ala Tyr Arg Glu Lys Arg Lys Ala Gln
565 570 575

Lys Ile Ala Glu Asn Gly Gly Glu Ile Val Lys Thr Lys Val Cys Pro
580 585 590

His Cys Asn Lys Glu Phe Ile Pro Thr Ser Asn Arg Gln Val Phe Cys
595 600 605

Ser Lys Glu Cys Cys Tyr Gln Ala Arg Gln Asp Lys Lys Lys Thr Asp
610 615 620

Arg Glu Ala Glu Arg Gly Asn His Tyr Tyr Arg Gln Arg Val Cys Ala
625 630 635 640

Val Cys Gly Asn Ser Tyr Trp Pro Thr His Ser Gln Gln Lys Phe Cys
645 650 655

Ser Glu Glu Cys Gln Arg Val Asn His Asn Lys Lys Thr Leu Glu Phe
660 665 670

Tyr His His Lys Lys Glu Lys Glu Lys Leu Gln Cys Lys Asp Leu Ser
675 680 685

Gln Thr Lys Glu Arg Val Ser Asp Met Asn Leu Ser Gly Thr Ile Thr
690 695 700

Thr Pro Ala
705



<210> 62
<211> 1420
<212> DNA

<213> XisA recombinase

<220>
<221> CDS
<222> (1)..(1416)

<400> 62

atg caa aat cag ggt caa gac aaa tat caa caa gcc ttt gca gac tta 48
Met Gln Asn Gln Gly Gln Asp Lys Tyr Gln Gln Ala Phe Ala Asp Leu
1 5 10 15

gag cca ctt tca tct acc gac ggc agt ttt ctc ggc tca agt ctg caa 96
Glu Pro Leu Ser Ser Thr Asp Gly Ser Phe Leu Gly Ser Ser Leu Gln
20 25 30

gca cag cag caa aga gaa cac atg aga aca aaa gta cta caa gac cta 144
Ala Gln Gln Gln Arg Glu His Met Arg Thr Lys Val Leu Gln Asp Leu
35 40 45

gac aag gta aat ctg cgt ttg aag tct gca aag acg aaa gtc tca gtt 192
Asp Lys Val Asn Leu Arg Leu Lys Ser Ala Lys Thr Lys Val Ser Val
50 55 60

cga gaa tct aac gga agt ctg caa tta cga gca acg tta cca att aaa 240
Arg Glu Ser Asn Gly Ser Leu Gln Leu Arg Ala Thr Leu Pro Ile Lys
65 70 75 80



cct gga gat aag gac acc aac ggt aca ggc aga aag caa tac aat ctc 288
Pro Gly Asp Lys Asp Thr Asn Gly Thr Gly Arg Lys Gln Tyr Asn Leu
85 90 95



agc ttg aat atc cct gca aac ttg gat gga ctg aag acg gct gag gaa 336
Ser Leu Asn Ile Pro Ala Asn Leu Asp Gly Leu Lys Thr Ala Glu Glu
100 105 110

gaa gct tat gaa tta ggt aaa tta atc gct cgg aaa acc ttt gaa tgg 384
Glu Ala Tyr Glu Leu Gly Lys Leu Ile Ala Arg Lys Thr Phe Glu Trp
115 120 125

aat gat aaa tat tta ggc aaa gaa gcc act aaa aaa gat tca caa aca 432
Asn Asp Lys Tyr Leu Gly Lys Glu Ala Thr Lys Lys Asp Ser Gln Thr
130 135 140

ata ggt gat tta cta gaa aaa ttt gca gaa gag tat ttt aaa acc cat 480
Ile Gly Asp Leu Leu Glu Lys Phe Ala Glu Glu Tyr Phe Lys Thr His
145 150 155 160

aaa cgc acc act aaa agc gaa cat acc ttt ttt tac tat ttt tcc cgc 528
Lys Arg Thr Thr Lys Ser Glu His Thr Phe Phe Tyr Tyr Phe Ser Arg
165 170 175

acc caa cga tat acc aat tcc aaa gat tta gca acg gcg gaa aat ctc 576
Thr Gln Arg Tyr Thr Asn Ser Lys Asp Leu Ala Thr Ala Glu Asn Leu
180 185 190

atc aat tca att gag caa atc gat aaa gaa tgg gcg aga tat aat gcc 624
Ile Asn Ser Ile Glu Gln Ile Asp Lys Glu Trp Ala Arg Tyr Asn Ala
195 200 205

gcc aga gcc ata tca gct ttt tgc ata aca ttc aat ata gaa att gat 672
Ala Arg Ala Ile Ser Ala Phe Cys Ile Thr Phe Asn Ile Glu Ile Asp
210 215 220

ttg tcc cag tat tcc aaa atg cct gat cgc aat tcg cgc aac atc ccc 720
Leu Ser Gln Tyr Ser Lys Met Pro Asp Arg Asn Ser Arg Asn Ile Pro
225 230 235 240

aca gat gca gaa ata cta tca gga att acc aaa ttt gaa gac tat cta 768
Thr Asp Ala Glu Ile Leu Ser Gly Ile Thr Lys Phe Glu Asp Tyr Leu
245 250 255 

gtt acc aga gga aat caa gtt aat gaa gat gta aaa gat agc tgg caa 816
Val Thr Arg Gly Asn Gln Val Asn Glu Asp Val Lys Asp Ser Trp Gln
260 265 270

ctt tgg cgc tgg aca tat gga atg tta gca gtt ttt ggt tta cgc ccc 864
Leu Trp Arg Trp Thr Tyr Gly Met Leu Ala Val Phe Gly Leu Arg Pro
275 280 285

agg gaa att ttt att aac cct aat att gat tgg tgg tta agc aaa gag 912
Arg Glu Ile Phe Ile Asn Pro Asn Ile Asp Trp Trp Leu Ser Lys Glu
290 295 300

aat ata gac ctc aca tgg aaa gta gac aaa gaa tgt aaa act ggt gaa 960
Asn Ile Asp Leu Thr Trp Lys Val Asp Lys Glu Cys Lys Thr Gly Glu
305 310 315 320

aga caa gca tta ccc tta cat aaa gaa tgg att gat gag ttt gat tta 1008
Arg Gln Ala Leu Pro Leu His Lys Glu Trp Ile Asp Glu Phe Asp Leu
325 330 335

aga aat ccg aaa tat tta gaa atg ctg gca aca gca att agt aaa aaa 1056
Arg Asn Pro Lys Tyr Leu Glu Met Leu Ala Thr Ala Ile Ser Lys Lys
340 345 350

gat aaa aca aat cat gct gaa ata aca gcc tta act cag cgt att agt 1104
Asp Lys Thr Asn His Ala Glu Ile Thr Ala Leu Thr Gln Arg Ile Ser
355 360 365

tgg tgg ttt egg aaa gtc gaa tta gat ttt aaa ccc tat gat tta cgt 1152 
Trp Trp Phe Arg Lys Val Glu Leu Asp Phe Lys Pro Tyr Asp Leu Arg 
370 375 380 

cac gec tgg gca ate aga gcg cat att tta ggc ata cca ate aaa gcg 1200 
His Ala Trp Ala lie Arg Ala His lie Leu Gly He Pro He Lys Ala 
385 390 395 400 

gcg get gat aat ttg ggg cat agt atg cag gtt cat aca caa acc tat 1248 
Ala Ala Asp Asn Leu Gly His Ser Met Gin Val His Thr Gin Thr Tyr 
405 410 415 

cag cgc tgg ttc teg eta gat atg egg aag tta gcg att aat cag get 1296 
Gin Arg Trp Phe Ser Leu Asp Met Arg Lys Leu Ala He Asn Gin Ala 
420 425 430 

ttg act aag agg aat gaa ttt gag gtg att agg gag gag aat get aaa 1344 
Leu Thr Lys Arg Asn Glu Phe Glu Val He Arg Glu Glu Asn Ala Lys 
435 440 445 

ttg cag ata gaa aat gaa agg ttg agg atg gaa att gag aag tta aag 1392 
Leu Gin He Glu Asn Glu Arg Leu Arg Met Glu He Glu Lys Leu Lys 
450 455 460 

atg gaa ata get tat aag aat agt tgag 1420 
Met Glu He Ala Tyr Lys Asn Ser 
465 470 

<210> 63 
<211> 472 
<212> PRT 

<213> XisA recombinase 
<400> 63 

Met Gln Asn Gln Gly Gln Asp Lys Tyr Gln Gln Ala Phe Ala Asp Leu
1 5 10 15

Glu Pro Leu Ser Ser Thr Asp Gly Ser Phe Leu Gly Ser Ser Leu Gln
20 25 30

Ala Gln Gln Gln Arg Glu His Met Arg Thr Lys Val Leu Gln Asp Leu
35 40 45

Asp Lys Val Asn Leu Arg Leu Lys Ser Ala Lys Thr Lys Val Ser Val
50 55 60

Arg Glu Ser Asn Gly Ser Leu Gln Leu Arg Ala Thr Leu Pro Ile Lys
65 70 75 80

Pro Gly Asp Lys Asp Thr Asn Gly Thr Gly Arg Lys Gln Tyr Asn Leu
85 90 95

Ser Leu Asn Ile Pro Ala Asn Leu Asp Gly Leu Lys Thr Ala Glu Glu
100 105 110

Glu Ala Tyr Glu Leu Gly Lys Leu Ile Ala Arg Lys Thr Phe Glu Trp
115 120 125

Asn Asp Lys Tyr Leu Gly Lys Glu Ala Thr Lys Lys Asp Ser Gln Thr
130 135 140

Ile Gly Asp Leu Leu Glu Lys Phe Ala Glu Glu Tyr Phe Lys Thr His
145 150 155 160




Lys Arg Thr Thr Lys Ser Glu His Thr Phe Phe Tyr Tyr Phe Ser Arg
165 170 175

Thr Gln Arg Tyr Thr Asn Ser Lys Asp Leu Ala Thr Ala Glu Asn Leu
180 185 190

Ile Asn Ser Ile Glu Gln Ile Asp Lys Glu Trp Ala Arg Tyr Asn Ala
195 200 205

Ala Arg Ala Ile Ser Ala Phe Cys Ile Thr Phe Asn Ile Glu Ile Asp
210 215 220

Leu Ser Gln Tyr Ser Lys Met Pro Asp Arg Asn Ser Arg Asn Ile Pro
225 230 235 240

Thr Asp Ala Glu Ile Leu Ser Gly Ile Thr Lys Phe Glu Asp Tyr Leu
245 250 255

Val Thr Arg Gly Asn Gln Val Asn Glu Asp Val Lys Asp Ser Trp Gln
260 265 270

Leu Trp Arg Trp Thr Tyr Gly Met Leu Ala Val Phe Gly Leu Arg Pro
275 280 285

Arg Glu Ile Phe Ile Asn Pro Asn Ile Asp Trp Trp Leu Ser Lys Glu
290 295 300

Asn Ile Asp Leu Thr Trp Lys Val Asp Lys Glu Cys Lys Thr Gly Glu
305 310 315 320

Arg Gln Ala Leu Pro Leu His Lys Glu Trp Ile Asp Glu Phe Asp Leu
325 330 335

Arg Asn Pro Lys Tyr Leu Glu Met Leu Ala Thr Ala Ile Ser Lys Lys
340 345 350

Asp Lys Thr Asn His Ala Glu Ile Thr Ala Leu Thr Gln Arg Ile Ser
355 360 365

Trp Trp Phe Arg Lys Val Glu Leu Asp Phe Lys Pro Tyr Asp Leu Arg
370 375 380

His Ala Trp Ala Ile Arg Ala His Ile Leu Gly Ile Pro Ile Lys Ala
385 390 395 400

Ala Ala Asp Asn Leu Gly His Ser Met Gln Val His Thr Gln Thr Tyr
405 410 415

Gln Arg Trp Phe Ser Leu Asp Met Arg Lys Leu Ala Ile Asn Gln Ala
420 425 430

Leu Thr Lys Arg Asn Glu Phe Glu Val Ile Arg Glu Glu Asn Ala Lys
435 440 445

Leu Gln Ile Glu Asn Glu Arg Leu Arg Met Glu Ile Glu Lys Leu Lys
450 455 460

Met Glu Ile Ala Tyr Lys Asn Ser
465 470

<210> 64 
<211> 1008 
<212> DNA

<213> Artificial Sequence

<220>
<221> CDS
<222> (1)..(1005)

<220>
<223> Description of Artificial Sequence: vector
pBS-SSV3

<400> 64

atg acg aaa gat aag acg cgt tat aaa tac ggg gat tat att tta cgc 48
Met Thr Lys Asp Lys Thr Arg Tyr Lys Tyr Gly Asp Tyr Ile Leu Arg
1 5 10 15

gag agg aaa ggg cgg tat tat gtt tac aag cta gag tat gaa aac ggt 96
Glu Arg Lys Gly Arg Tyr Tyr Val Tyr Lys Leu Glu Tyr Glu Asn Gly
20 25 30

gag gta aaa gag cgt tac gtg ggt cct tta gct gac gtc gtt gaa tca 144
Glu Val Lys Glu Arg Tyr Val Gly Pro Leu Ala Asp Val Val Glu Ser
35 40 45

tat cta aaa atg aaa tta ggg gtc gta ggg gat act ccc cta caa gcg 192
Tyr Leu Lys Met Lys Leu Gly Val Val Gly Asp Thr Pro Leu Gln Ala
50 55 60

gat ccc ccc ggt ttc gag ccc ggg aca agc gga agc ggt ggt gga aaa 240
Asp Pro Pro Gly Phe Glu Pro Gly Thr Ser Gly Ser Gly Gly Gly Lys
65 70 75 80

gag gga act gaa cga cgt aaa ata gcg ttg gtt gcc aat ttg cgc caa 288
Glu Gly Thr Glu Arg Arg Lys Ile Ala Leu Val Ala Asn Leu Arg Gln
85 90 95

tac gcg acg gac ggc aac ata aag gcg ttc tac aac tat ctc atg aac 336
Tyr Ala Thr Asp Gly Asn Ile Lys Ala Phe Tyr Asn Tyr Leu Met Asn
100 105 110

gaa agg ggg ata agc gaa aaa act gca aag gac tac atc aat gct ata 384
Glu Arg Gly Ile Ser Glu Lys Thr Ala Lys Asp Tyr Ile Asn Ala Ile
115 120 125

tca aag ccg tat aaa gag acg aga gac gca cag aag gct tac cga ctc 432
Ser Lys Pro Tyr Lys Glu Thr Arg Asp Ala Gln Lys Ala Tyr Arg Leu
130 135 140

ttt gca cgt ttc tta gcg tca cgc aat atc ata cat gat gaa ttt gcg 480
Phe Ala Arg Phe Leu Ala Ser Arg Asn Ile Ile His Asp Glu Phe Ala
145 150 155 160

gat aaa ata ttg aaa gcg gta aag gtg aag aag gcg aac gct gat atc 528
Asp Lys Ile Leu Lys Ala Val Lys Val Lys Lys Ala Asn Ala Asp Ile
165 170 175

tac att cca acg ttg gaa gag ata aaa agg acg tta caa tta gca aaa 576
Tyr Ile Pro Thr Leu Glu Glu Ile Lys Arg Thr Leu Gln Leu Ala Lys
180 185 190

gac tat agc gaa aac gtc tac ttc atc tac cgt atc gct ctc gag tcg 624
Asp Tyr Ser Glu Asn Val Tyr Phe Ile Tyr Arg Ile Ala Leu Glu Ser
195 200 205

ggc gtt agg ctg agc gaa ata ctg aaa gtg ctg aag gaa ccc gaa agg 672
Gly Val Arg Leu Ser Glu Ile Leu Lys Val Leu Lys Glu Pro Glu Arg
210 215 220

gac att tgc ggt aac gac gtc tgt tat tat ccg ctt agt tgg act agg 720
Asp Ile Cys Gly Asn Asp Val Cys Tyr Tyr Pro Leu Ser Trp Thr Arg
225 230 235 240

gga tat aag ggc gtc ttc tat gta ttc cac ata acg cct ctg aag aga 768
Gly Tyr Lys Gly Val Phe Tyr Val Phe His Ile Thr Pro Leu Lys Arg
245 250 255

gta gag gtg acg aag tgg gca ata gcg gac ttt gaa cga cgt cat aag 816
Val Glu Val Thr Lys Trp Ala Ile Ala Asp Phe Glu Arg Arg His Lys
260 265 270

gac gct ata gcg ata aag tac ttc cgc aaa ttc gta gcg tct aag atg 864
Asp Ala Ile Ala Ile Lys Tyr Phe Arg Lys Phe Val Ala Ser Lys Met
275 280 285

gct gag cta agc gta ccg tta gat att atc gat ttt att caa ggg cgt 912
Ala Glu Leu Ser Val Pro Leu Asp Ile Ile Asp Phe Ile Gln Gly Arg
290 295 300

aaa ccg aca cgc gtt tta acg caa cat tac gta tcg ctc ttc ggc ata 960
Lys Pro Thr Arg Val Leu Thr Gln His Tyr Val Ser Leu Phe Gly Ile
305 310 315 320

gcg aaa gag caa tat aaa aag tat gcg gaa tgg cta aaa ggg gtc tga 1008
Ala Lys Glu Gln Tyr Lys Lys Tyr Ala Glu Trp Leu Lys Gly Val
325 330 335

<210> 65 
<211> 335 
<212> PRT

<213> Artificial Sequence

<223> Description of Artificial Sequence: vector
pBS-SSV3

<400> 65

Met Thr Lys Asp Lys Thr Arg Tyr Lys Tyr Gly Asp Tyr Ile Leu Arg
1 5 10 15

Glu Arg Lys Gly Arg Tyr Tyr Val Tyr Lys Leu Glu Tyr Glu Asn Gly
20 25 30

Glu Val Lys Glu Arg Tyr Val Gly Pro Leu Ala Asp Val Val Glu Ser
35 40 45

Tyr Leu Lys Met Lys Leu Gly Val Val Gly Asp Thr Pro Leu Gln Ala
50 55 60

Asp Pro Pro Gly Phe Glu Pro Gly Thr Ser Gly Ser Gly Gly Gly Lys
65 70 75 80

Glu Gly Thr Glu Arg Arg Lys Ile Ala Leu Val Ala Asn Leu Arg Gln
85 90 95

Tyr Ala Thr Asp Gly Asn Ile Lys Ala Phe Tyr Asn Tyr Leu Met Asn
100 105 110

Glu Arg Gly Ile Ser Glu Lys Thr Ala Lys Asp Tyr Ile Asn Ala Ile
115 120 125

Ser Lys Pro Tyr Lys Glu Thr Arg Asp Ala Gln Lys Ala Tyr Arg Leu
130 135 140

Phe Ala Arg Phe Leu Ala Ser Arg Asn Ile Ile His Asp Glu Phe Ala
145 150 155 160

Asp Lys Ile Leu Lys Ala Val Lys Val Lys Lys Ala Asn Ala Asp Ile
165 170 175

Tyr Ile Pro Thr Leu Glu Glu Ile Lys Arg Thr Leu Gln Leu Ala Lys
180 185 190

Asp Tyr Ser Glu Asn Val Tyr Phe Ile Tyr Arg Ile Ala Leu Glu Ser
195 200 205

Gly Val Arg Leu Ser Glu Ile Leu Lys Val Leu Lys Glu Pro Glu Arg
210 215 220

Asp Ile Cys Gly Asn Asp Val Cys Tyr Tyr Pro Leu Ser Trp Thr Arg
225 230 235 240

Gly Tyr Lys Gly Val Phe Tyr Val Phe His Ile Thr Pro Leu Lys Arg
245 250 255

Val Glu Val Thr Lys Trp Ala Ile Ala Asp Phe Glu Arg Arg His Lys
260 265 270

Asp Ala Ile Ala Ile Lys Tyr Phe Arg Lys Phe Val Ala Ser Lys Met
275 280 285

Ala Glu Leu Ser Val Pro Leu Asp Ile Ile Asp Phe Ile Gln Gly Arg
290 295 300

Lys Pro Thr Arg Val Leu Thr Gln His Tyr Val Ser Leu Phe Gly Ile
305 310 315 320

Ala Lys Glu Gln Tyr Lys Lys Tyr Ala Glu Trp Leu Lys Gly Val
325 330 335



<210> 66 
<211> 1441
<212> DNA

<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-XisA

<220>
<221> CDS
<222> (1)..(1437)

<400> 66

atg ccc aag aag aag agg aag gtg caa aat cag ggt caa gac aaa tat 48
Met Pro Lys Lys Lys Arg Lys Val Gln Asn Gln Gly Gln Asp Lys Tyr
1 5 10 15

caa caa gcc ttt gca gac tta gag cca ctt tca tct acc gac ggc agt 96
Gln Gln Ala Phe Ala Asp Leu Glu Pro Leu Ser Ser Thr Asp Gly Ser
20 25 30

ttt ctc ggc tca agt ctg caa gca cag cag caa aga gaa cac atg aga 144
Phe Leu Gly Ser Ser Leu Gln Ala Gln Gln Gln Arg Glu His Met Arg
35 40 45

aca aaa gta cta caa gac cta gac aag gta aat ctg cgt ttg aag tct 192
Thr Lys Val Leu Gln Asp Leu Asp Lys Val Asn Leu Arg Leu Lys Ser
50 55 60

gca aag acg aaa gtc tca gtt cga gaa tct aac gga agt ctg caa tta 240
Ala Lys Thr Lys Val Ser Val Arg Glu Ser Asn Gly Ser Leu Gln Leu
65 70 75 80
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tin a?* ^ 9 r ta ^ ^ 333 CCt gga gat aag gac acc aac TOt aca 288 

Arg Ala Thr Leu Pro Ile Lys Pro Gly Asp Lys Asp Thr Asn Gly Thr
85 90 95

ggc aga aag caa tac aat ctc agc ttg aat atc cct gca aac ttg gat 336
Gly Arg Lys Gln Tyr Asn Leu Ser Leu Asn Ile Pro Ala Asn Leu Asp
100 105 110

gga ctg aag acg gct gag gaa gaa gct tat gaa tta ggt aaa tta atc 384
Gly Leu Lys Thr Ala Glu Glu Glu Ala Tyr Glu Leu Gly Lys Leu Ile
115 120 125

gct cgg aaa acc ttt gaa tgg aat gat aaa tat tta ggc aaa gaa gcc 432
Ala Arg Lys Thr Phe Glu Trp Asn Asp Lys Tyr Leu Gly Lys Glu Ala
130 135 140

act aaa aaa gat tca caa aca ata ggt gat tta cta gaa aaa ttt gca 480
Thr Lys Lys Asp Ser Gln Thr Ile Gly Asp Leu Leu Glu Lys Phe Ala
145 150 155 160

gaa gag tat ttt aaa acc cat aaa cgc acc act aaa agc gaa cat acc 528
Glu Glu Tyr Phe Lys Thr His Lys Arg Thr Thr Lys Ser Glu His Thr
165 170 175

ttt ttt tac tat ttt tcc cgc acc caa cga tat acc aat tcc aaa gat 576
Phe Phe Tyr Tyr Phe Ser Arg Thr Gln Arg Tyr Thr Asn Ser Lys Asp
180 185 190

tta gca acg gcg gaa aat ctc atc aat tca att gag caa atc gat aaa 624
Leu Ala Thr Ala Glu Asn Leu Ile Asn Ser Ile Glu Gln Ile Asp Lys
195 200 205

gaa tgg gcg aga tat aat gcc gcc aga gcc ata tca gct ttt tgc ata 672
Glu Trp Ala Arg Tyr Asn Ala Ala Arg Ala Ile Ser Ala Phe Cys Ile
210 215 220

aca ttc aat ata gaa att gat ttg tcc cag tat tcc aaa atg cct gat 720
Thr Phe Asn Ile Glu Ile Asp Leu Ser Gln Tyr Ser Lys Met Pro Asp
225 230 235 240

cgc aat tcg cgc aac atc ccc aca gat gca gaa ata cta tca gga att 768
Arg Asn Ser Arg Asn Ile Pro Thr Asp Ala Glu Ile Leu Ser Gly Ile
245 250 255

acc aaa ttt gaa gac tat cta gtt acc aga gga aat caa gtt aat gaa 816
Thr Lys Phe Glu Asp Tyr Leu Val Thr Arg Gly Asn Gln Val Asn Glu
260 265 270

gat gta aaa gat agc tgg caa ctt tgg cgc tgg aca tat gga atg tta 864
Asp Val Lys Asp Ser Trp Gln Leu Trp Arg Trp Thr Tyr Gly Met Leu
275 280 285

gca gtt ttt ggt tta cgc ccc agg gaa att ttt att aac cct aat att 912
Ala Val Phe Gly Leu Arg Pro Arg Glu Ile Phe Ile Asn Pro Asn Ile
290 295 300

gat tgg tgg tta agc aaa gag aat ata gac ctc aca tgg aaa gta gac 960
Asp Trp Trp Leu Ser Lys Glu Asn Ile Asp Leu Thr Trp Lys Val Asp
305 310 315 320

aaa gaa tgt aaa act ggt gaa aga caa gca tta ccc tta cat aaa gaa 1008
Lys Glu Cys Lys Thr Gly Glu Arg Gln Ala Leu Pro Leu His Lys Glu
325 330 335

tgg att gat gag ttt gat tta aga aat ccg aaa tat tta gaa atg ctg 1056
Trp Ile Asp Glu Phe Asp Leu Arg Asn Pro Lys Tyr Leu Glu Met Leu
340 345 350

gca aca gca att agt aaa aaa gat aaa aca aat cat gct gaa ata aca 1104
Ala Thr Ala Ile Ser Lys Lys Asp Lys Thr Asn His Ala Glu Ile Thr
355 360 365

gcc tta act cag cgt att agt tgg tgg ttt cgg aaa gtc gaa tta gat 1152
Ala Leu Thr Gln Arg Ile Ser Trp Trp Phe Arg Lys Val Glu Leu Asp
370 375 380

ttt aaa ccc tat gat tta cgt cac gcc tgg gca atc aga gcg cat att 1200
Phe Lys Pro Tyr Asp Leu Arg His Ala Trp Ala Ile Arg Ala His Ile
385 390 395 400

tta ggc ata cca atc aaa gcg gcg gct gat aat ttg ggg cat agt atg 1248
Leu Gly Ile Pro Ile Lys Ala Ala Ala Asp Asn Leu Gly His Ser Met
405 410 415

cag gtt cat aca caa acc tat cag cgc tgg ttc tcg cta gat atg cgg 1296
Gln Val His Thr Gln Thr Tyr Gln Arg Trp Phe Ser Leu Asp Met Arg
420 425 430

aag tta gcg att aat cag gct ttg act aag agg aat gaa ttt gag gtg 1344
Lys Leu Ala Ile Asn Gln Ala Leu Thr Lys Arg Asn Glu Phe Glu Val
435 440 445

att agg gag gag aat gct aaa ttg cag ata gaa aat gaa agg ttg agg 1392
Ile Arg Glu Glu Asn Ala Lys Leu Gln Ile Glu Asn Glu Arg Leu Arg
450 455 460

atg gaa att gag aag tta aag atg gaa ata gct tat aag aat agt tgag 1441
Met Glu Ile Glu Lys Leu Lys Met Glu Ile Ala Tyr Lys Asn Ser
465 470 475



<210> 67
<211> 479
<212> PRT

<213> Artificial Sequence

<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-XisA

<400> 67

Met Pro Lys Lys Lys Arg Lys Val Gln Asn Gln Gly Gln Asp Lys Tyr
1 5 10 15

Gln Gln Ala Phe Ala Asp Leu Glu Pro Leu Ser Ser Thr Asp Gly Ser
20 25 30

Phe Leu Gly Ser Ser Leu Gln Ala Gln Gln Gln Arg Glu His Met Arg
35 40 45

Thr Lys Val Leu Gln Asp Leu Asp Lys Val Asn Leu Arg Leu Lys Ser
50 55 60

Ala Lys Thr Lys Val Ser Val Arg Glu Ser Asn Gly Ser Leu Gln Leu
65 70 75 80

Arg Ala Thr Leu Pro Ile Lys Pro Gly Asp Lys Asp Thr Asn Gly Thr
85 90 95

Gly Arg Lys Gln Tyr Asn Leu Ser Leu Asn Ile Pro Ala Asn Leu Asp
100 105 110

Gly Leu Lys Thr Ala Glu Glu Glu Ala Tyr Glu Leu Gly Lys Leu Ile
115 120 125

Ala Arg Lys Thr Phe Glu Trp Asn Asp Lys Tyr Leu Gly Lys Glu Ala
130 135 140

Thr Lys Lys Asp Ser Gln Thr Ile Gly Asp Leu Leu Glu Lys Phe Ala
145 150 155 160

Glu Glu Tyr Phe Lys Thr His Lys Arg Thr Thr Lys Ser Glu His Thr
165 170 175

Phe Phe Tyr Tyr Phe Ser Arg Thr Gln Arg Tyr Thr Asn Ser Lys Asp
180 185 190

Leu Ala Thr Ala Glu Asn Leu Ile Asn Ser Ile Glu Gln Ile Asp Lys
195 200 205

Glu Trp Ala Arg Tyr Asn Ala Ala Arg Ala Ile Ser Ala Phe Cys Ile
210 215 220

Thr Phe Asn Ile Glu Ile Asp Leu Ser Gln Tyr Ser Lys Met Pro Asp
225 230 235 240

Arg Asn Ser Arg Asn Ile Pro Thr Asp Ala Glu Ile Leu Ser Gly Ile
245 250 255

Thr Lys Phe Glu Asp Tyr Leu Val Thr Arg Gly Asn Gln Val Asn Glu
260 265 270

Asp Val Lys Asp Ser Trp Gln Leu Trp Arg Trp Thr Tyr Gly Met Leu
275 280 285

Ala Val Phe Gly Leu Arg Pro Arg Glu Ile Phe Ile Asn Pro Asn Ile
290 295 300

Asp Trp Trp Leu Ser Lys Glu Asn Ile Asp Leu Thr Trp Lys Val Asp
305 310 315 320

Lys Glu Cys Lys Thr Gly Glu Arg Gln Ala Leu Pro Leu His Lys Glu
325 330 335

Trp Ile Asp Glu Phe Asp Leu Arg Asn Pro Lys Tyr Leu Glu Met Leu
340 345 350

Ala Thr Ala Ile Ser Lys Lys Asp Lys Thr Asn His Ala Glu Ile Thr
355 360 365

Ala Leu Thr Gln Arg Ile Ser Trp Trp Phe Arg Lys Val Glu Leu Asp
370 375 380

Phe Lys Pro Tyr Asp Leu Arg His Ala Trp Ala Ile Arg Ala His Ile
385 390 395 400

Leu Gly Ile Pro Ile Lys Ala Ala Ala Asp Asn Leu Gly His Ser Met
405 410 415

Gln Val His Thr Gln Thr Tyr Gln Arg Trp Phe Ser Leu Asp Met Arg
420 425 430

Lys Leu Ala Ile Asn Gln Ala Leu Thr Lys Arg Asn Glu Phe Glu Val
435 440 445

Ile Arg Glu Glu Asn Ala Lys Leu Gln Ile Glu Asn Glu Arg Leu Arg
450 455 460

Met Glu Ile Glu Lys Leu Lys Met Glu Ile Ala Tyr Lys Asn Ser
465 470 475

<210> 68
<211> 1029 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: DNA sequence 
coding for fusion protein NLS-Ssv 

<220>
<221> CDS
<222> (1)..(1026)

<400> 68

atg ccc aag aag aag agg aag gtg acg aaa gat aag acg cgt tat aaa 48
Met Pro Lys Lys Lys Arg Lys Val Thr Lys Asp Lys Thr Arg Tyr Lys
1 5 10 15

tac ggg gat tat att tta cgc gag agg aaa ggg cgg tat tat gtt tac 96
Tyr Gly Asp Tyr Ile Leu Arg Glu Arg Lys Gly Arg Tyr Tyr Val Tyr
20 25 30

aag cta gag tat gaa aac ggt gag gta aaa gag cgt tac gtg ggt cct 144
Lys Leu Glu Tyr Glu Asn Gly Glu Val Lys Glu Arg Tyr Val Gly Pro
35 40 45

tta gct gac gtc gtt gaa tca tat cta aaa atg aaa tta ggg gtc gta 192
Leu Ala Asp Val Val Glu Ser Tyr Leu Lys Met Lys Leu Gly Val Val
50 55 60

ggg gat act ccc cta caa gcg gat ccc ccc ggt ttc gag ccc ggg aca 240
Gly Asp Thr Pro Leu Gln Ala Asp Pro Pro Gly Phe Glu Pro Gly Thr
65 70 75 80

agc gga agc ggt ggt gga aaa gag gga act gaa cga cgt aaa ata gcg 288
Ser Gly Ser Gly Gly Gly Lys Glu Gly Thr Glu Arg Arg Lys Ile Ala
85 90 95

ttg gtt gcc aat ttg cgc caa tac gcg acg gac ggc aac ata aag gcg 336
Leu Val Ala Asn Leu Arg Gln Tyr Ala Thr Asp Gly Asn Ile Lys Ala
100 105 110

ttc tac aac tat ctc atg aac gaa agg ggg ata agc gaa aaa act gca 384
Phe Tyr Asn Tyr Leu Met Asn Glu Arg Gly Ile Ser Glu Lys Thr Ala
115 120 125

aag gac tac atc aat gct ata tca aag ccg tat aaa gag acg aga gac 432
Lys Asp Tyr Ile Asn Ala Ile Ser Lys Pro Tyr Lys Glu Thr Arg Asp
130 135 140

gca cag aag gct tac cga ctc ttt gca cgt ttc tta gcg tca cgc aat 480
Ala Gln Lys Ala Tyr Arg Leu Phe Ala Arg Phe Leu Ala Ser Arg Asn
145 150 155 160

atc ata cat gat gaa ttt gcg gat aaa ata ttg aaa gcg gta aag gtg 528
Ile Ile His Asp Glu Phe Ala Asp Lys Ile Leu Lys Ala Val Lys Val
165 170 175

aag aag gcg aac gct gat atc tac att cca acg ttg gaa gag ata aaa 576
Lys Lys Ala Asn Ala Asp Ile Tyr Ile Pro Thr Leu Glu Glu Ile Lys
180 185 190

agg acg tta caa tta gca aaa gac tat agc gaa aac gtc tac ttc atc 624
Arg Thr Leu Gln Leu Ala Lys Asp Tyr Ser Glu Asn Val Tyr Phe Ile
195 200 205

tac cgt atc gct ctc gag tcg ggc gtt agg ctg agc gaa ata ctg aaa 672
Tyr Arg Ile Ala Leu Glu Ser Gly Val Arg Leu Ser Glu Ile Leu Lys
210 215 220

gtg ctg aag gaa ccc gaa agg gac att tgc ggt aac gac gtc tgt tat 720
Val Leu Lys Glu Pro Glu Arg Asp Ile Cys Gly Asn Asp Val Cys Tyr
225 230 235 240

tat ccg ctt agt tgg act agg gga tat aag ggc gtc ttc tat gta ttc 768
Tyr Pro Leu Ser Trp Thr Arg Gly Tyr Lys Gly Val Phe Tyr Val Phe
245 250 255

cac ata acg cct ctg aag aga gta gag gtg acg aag tgg gca ata gcg 816
His Ile Thr Pro Leu Lys Arg Val Glu Val Thr Lys Trp Ala Ile Ala
260 265 270

gac ttt gaa cga cgt cat aag gac gct ata gcg ata aag tac ttc cgc 864
Asp Phe Glu Arg Arg His Lys Asp Ala Ile Ala Ile Lys Tyr Phe Arg
275 280 285

aaa ttc gta gcg tct aag atg gct gag cta agc gta ccg tta gat att 912
Lys Phe Val Ala Ser Lys Met Ala Glu Leu Ser Val Pro Leu Asp Ile
290 295 300

atc gat ttt att caa ggg cgt aaa ccg aca cgc gtt tta acg caa cat 960
Ile Asp Phe Ile Gln Gly Arg Lys Pro Thr Arg Val Leu Thr Gln His
305 310 315 320

tac gta tcg ctc ttc ggc ata gcg aaa gag caa tat aaa aag tat gcg 1008
Tyr Val Ser Leu Phe Gly Ile Ala Lys Glu Gln Tyr Lys Lys Tyr Ala
325 330 335

gaa tgg cta aaa ggg gtc tga 1029
Glu Trp Leu Lys Gly Val
340



<210> 69 
<211> 342 
<212> PRT 
<213> Artificial Sequence

<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-Ssv

<400> 69

Met Pro Lys Lys Lys Arg Lys Val Thr Lys Asp Lys Thr Arg Tyr Lys
1 5 10 15

Tyr Gly Asp Tyr Ile Leu Arg Glu Arg Lys Gly Arg Tyr Tyr Val Tyr
20 25 30

Lys Leu Glu Tyr Glu Asn Gly Glu Val Lys Glu Arg Tyr Val Gly Pro
35 40 45

Leu Ala Asp Val Val Glu Ser Tyr Leu Lys Met Lys Leu Gly Val Val
50 55 60

Gly Asp Thr Pro Leu Gln Ala Asp Pro Pro Gly Phe Glu Pro Gly Thr
65 70 75 80

Ser Gly Ser Gly Gly Gly Lys Glu Gly Thr Glu Arg Arg Lys Ile Ala
85 90 95

Leu Val Ala Asn Leu Arg Gln Tyr Ala Thr Asp Gly Asn Ile Lys Ala
100 105 110

Phe Tyr Asn Tyr Leu Met Asn Glu Arg Gly Ile Ser Glu Lys Thr Ala
115 120 125

Lys Asp Tyr Ile Asn Ala Ile Ser Lys Pro Tyr Lys Glu Thr Arg Asp
130 135 140

Ala Gln Lys Ala Tyr Arg Leu Phe Ala Arg Phe Leu Ala Ser Arg Asn
145 150 155 160

Ile Ile His Asp Glu Phe Ala Asp Lys Ile Leu Lys Ala Val Lys Val
165 170 175

Lys Lys Ala Asn Ala Asp Ile Tyr Ile Pro Thr Leu Glu Glu Ile Lys
180 185 190

Arg Thr Leu Gln Leu Ala Lys Asp Tyr Ser Glu Asn Val Tyr Phe Ile
195 200 205

Tyr Arg Ile Ala Leu Glu Ser Gly Val Arg Leu Ser Glu Ile Leu Lys
210 215 220

Val Leu Lys Glu Pro Glu Arg Asp Ile Cys Gly Asn Asp Val Cys Tyr
225 230 235 240

Tyr Pro Leu Ser Trp Thr Arg Gly Tyr Lys Gly Val Phe Tyr Val Phe
245 250 255

His Ile Thr Pro Leu Lys Arg Val Glu Val Thr Lys Trp Ala Ile Ala
260 265 270

Asp Phe Glu Arg Arg His Lys Asp Ala Ile Ala Ile Lys Tyr Phe Arg
275 280 285

Lys Phe Val Ala Ser Lys Met Ala Glu Leu Ser Val Pro Leu Asp Ile
290 295 300

Ile Asp Phe Ile Gln Gly Arg Lys Pro Thr Arg Val Leu Thr Gln His
305 310 315 320

Tyr Val Ser Leu Phe Gly Ile Ala Lys Glu Gln Tyr Lys Lys Tyr Ala
325 330 335

Glu Trp Leu Lys Gly Val
340



<210> 70 
<211> 3908 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: vector 



pBS-SSV3

<400> 70

cacctaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag 60 
ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac 120 
cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga 180 
ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 240 
accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg 300 
gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa 360 
gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac 420 
caccacaccc gccgcgctta atgcgccgct acagggcgcg tcccattcgc cattcaggct 480
gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540 
agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg 600 
ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat tggagctcca 660 
ccgcggtggc ggccgcccga tatgacgaaa gataagacgc gttataaata cggggattat 720
attttacgcg agaggaaagg gcggtattat gtttacaagc tagagtatga aaacggtgag 780
gtaaaagagc gttacgtggg tcctttagct gacgtcgttg aatcatatct aaaaatgaaa 840 
ttaggggtcg taggggatac tcccctacaa gcggatcccc ccggtttcga gcccgggaca 900 
agcggaagcg gtggtggaaa agagggaact gaacgacgta aaatagcgtt ggttgccaat 960 
ttgcgccaat acgcgacgga cggcaacata aaggcgttct acaactatct catgaacgaa 1020 
agggggataa gcgaaaaaac tgcaaaggac tacatcaatg ctatatcaaa gccgtataaa 1080 
gagacgagag acgcacagaa ggcttaccga ctctttgcac gtttcttagc gtcacgcaat 1140
atcatacatg atgaatttgc ggataaaata ttgaaagcgg taaaggtgaa gaaggcgaac 1200
gctgatatct acattccaac gttggaagag ataaaaagga cgttacaatt agcaaaagac 1260 
tatagcgaaa acgtctactt catctaccgt atcgctctcg agtcgggcgt taggctgagc 1320 
gaaatactga aagtgctgaa ggaacccgaa agggacattt gcggtaacga cgtctgttat 1380 
tatccgctta gttggactag gggatataag ggcgtcttct atgtattcca cataacgcct 1440
ctgaagagag tagaggtgac gaagtgggca atagcggact ttgaacgacg tcataaggac 1500 
gctatagcga taaagtactt ccgcaaattc gtagcgtcta agatggctga gctaagcgta 1560
ccgttagata ttatcgattt tattcaaggg cgtaaaccga cacgcgtttt aacgcaacat 1620
tacgtatcgc tcttcggcat agcgaaagag caatataaaa agtatgcgga atggctaaaa 1680 
ggggtctgac tcgagggggg gcccggtacc cagcttttgt tccctttagt gagggttaat 1740 
ttcgagcttg gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgctcac 1800 
aattccacac aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt 1860 
gagctaactc acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc 1920 
gtgccagctg cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg 1980 
ctcttccgct tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt 2040 
atcagctcac tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa 2100 
gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc 2160
gtttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag 2220
gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt 2280 
gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg 2340 
aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg 2400 
ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg 2460
taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac 2520 
tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg 2580
gcctaactac ggctacacta gaaggacagt atttggtatc tgcgctctgc tgaagccagt 2640 
taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg 2700 
tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc 2760

tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt 2820 
ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt 2880 
taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag 2940 
tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt 3000 
cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg caatgatacc 3060 
gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc 3120
cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg 3180 
ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac 3240 
aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg 3300 
atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc 3360 
tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact 3420 
gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc 3480 
aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat 3540 
acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc 3600 
ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac 3660 
tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa 3720 
aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact 3780 
catactcttc ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg 3840 
atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg 3900 
aaaagtgc 3908

<210> 71 
<211> 3927 
<212> DNA 
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector
pBS-SSV4

<400> 71 

cacctaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag 60
ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac 120
cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga 180
ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 240
accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg 300
gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa 360
gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac 420
caccacaccc gccgcgctta atgcgccgct acagggcgcg tcccattcgc cattcaggct 480
gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540
agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg 600
ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat tggagctcca 660
ccgcggtggc ggccgcacca tgcccaagaa gaagaggaag gtgacgaaag ataagacgcg 720
ttataaatac ggggattata ttttacgcga gaggaaaggg cggtattatg tttacaagct 780
agagtatgaa aacggtgagg taaaagagcg ttacgtgggt cctttagctg acgtcgttga 840
atcatatcta aaaatgaaat taggggtcgt aggggatact cccctacaag cggatccccc 900
cggtttcgag cccgggacaa gcggaagcgg tggtggaaaa gagggaactg aacgacgtaa 960
aatagcgttg gttgccaatt tgcgccaata cgcgacggac ggcaacataa aggcgttcta 1020
caactatctc atgaacgaaa gggggataag cgaaaaaact gcaaaggact acatcaatgc 1080
tatatcaaag ccgtataaag agacgagaga cgcacagaag gcttaccgac tctttgcacg 1140
tttcttagcg tcacgcaata tcatacatga tgaatttgcg gataaaatat tgaaagcggt 1200
aaaggtgaag aaggcgaacg ctgatatcta cattccaacg ttggaagaga taaaaaggac 1260
gttacaatta gcaaaagact atagcgaaaa cgtctacttc atctaccgta tcgctctcga 1320
gtcgggcgtt aggctgagcg aaatactgaa agtgctgaag gaacccgaaa gggacatttg 1380
cggtaacgac gtctgttatt atccgcttag ttggactagg ggatataagg gcgtcttcta 1440
tgtattccac ataacgcctc tgaagagagt agaggtgacg aagtgggcaa tagcggactt 1500
tgaacgacgt cataaggacg ctatagcgat aaagtacttc cgcaaattcg tagcgtctaa 1560
gatggctgag ctaagcgtac cgttagatat tatcgatttt attcaagggc gtaaaccgac 1620
acgcgtttta acgcaacatt acgtatcgct cttcggcata gcgaaagagc aatataaaaa 1680
gtatgcggaa tggctaaaag gggtctgact cgaggggggg cccggtaccc agcttttgtt 1740
ccctttagtg agggttaatt tcgagcttgg cgtaatcatg gtcatagctg tttcctgtgt 1800
gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 1860
cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 1920
tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 1980
gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 2040
ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 2100
caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 2160
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 2220
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 2280
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 2340
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 2400
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 2460
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 2520
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 2580
cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 2640
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 2700
aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 2760
aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 2820
actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 2880
taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 2940
gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 3000
tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 3060
ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 3120
accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 3180
agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca 3240
acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 3300
tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 3360
cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac 3420
tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 3480
ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt 3540
gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 3600
tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 3660
ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 3720
gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga 3780
cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg 3840
gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg 3900
ttccgcgcac atttccccga aaagtgc 3927

<210> 72
<211> 3351 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pBS-SSVs 
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15 



<400> 72
cgacctcgag tcagacccct tttagccatt ccgcatactt tttatattgc tctttcgcta 60
tgccgaagag cgatacgtaa tgttgcgtta aaacgcgtgt cggtttacgc ccttgaataa 120
aatcgataat atctaacggt acgcttagct cagccatctt agacgctacg aatttgcgga 180
agtactttat cgctatagcg tccttatgac gtcgttcaaa gtccgctatt gcccacttcg 240
tcacctctac tctcttcaga ggcgttatgt ggaatacata gaagacgccc ttatatcccc 300
tagtccaact aagcggataa taacagacgt cgttaccgca aatgtccctt tcgggttcct 360
tcagcacttt cagtatttcg ctcagcctaa cgcccgactc gagggggggc ccggtaccca 420
gcttttgttc cctttagtga gggttaattt cgagcttggc gtaatcatgg tcatagctgt 480
ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa 540
agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac 600
tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg 660
cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 720
gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 780
ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 840
ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 900
atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 960
aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 1020
gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta 1080
ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 1140
ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 1200
acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag 1260
gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat 1320
ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 1380
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 1440
gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 1500
ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct 1560
agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt 1620
ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 1680
gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac 1740
catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat 1800
cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg 1860
cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata 1920
gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta 1980
tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt 2040
gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag 2100
tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa 2160
gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc 2220
gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt 2280
taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc 2340
tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 2400
ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 2460
taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca 2520
tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac 2580
aaataggggt tccgcgcaca tttccccgaa aagtgccacc taaattgtaa gcgttaatat 2640
tttgttaaaa ttcgcgttaa atttttgtta aatcagctca ttttttaacc aataggccga 2700
aatcggcaaa atcccttata aatcaaaaga atagaccgag atagggttga gtgttgttcc 2760
agtttggaac aagagtccac tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac 2820
cgtctatcag ggcgatggcc cactacgtga accatcaccc taatcaagtt ttttggggtc 2880
gaggtgccgt aaagcactaa atcggaaccc taaagggagc ccccgattta gagcttgacg 2940
gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag cgggcgctag 3000
ggcgctggca agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc 3060
gccgctacag ggcgcgtccc attcgccatt caggctgcgc aactgttggg aagggcgatc 3120
ggtgcgggcc tcttcgctat tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt 3180
aagttgggta acgccagggt tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt 3240
gtaatacgac tcactatagg gcgaattgga gctccaccgc ggtggcggcc gctctagaac 3300
tagtggatcc cccgggctgc aggaattcga tatcaagctt atcgataccg t 3351
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<210> 73 
<211> 5730 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: vector 
pCMVC31(NNLS) 

<400> 73 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 

cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 

ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 

cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 
cagggcggcc gcaccatgcc caagaagaag aggaaggtga cacaaggggt tgtgaccggg 1020 
gtggacacgt acgcgggtgc ttacgaccgt cagtcgcgcg agcgcgagaa ttcgagcgca 1080 

gcaagcccag cgacacagcg tagcgccaac gaagacaagg cggccgacct tcagcgcgaa 1140
gtcgagcgcg acgggggccg gttcaggttc gtcgggcatt tcagcgaagc gccgggcacg 1200 
tcggcgttcg ggacggcgga gcgcccggag ttcgaacgca tcctgaacga atgccgcgcc 1260 
gggcggctca acatgatcat tgtctatgac gtgtcgcgct tctcgcgcct gaaggtcatg 1320 
gacgcgattc cgattgtctc ggaattgctc gccctgggcg tgacgattgt ttccactcag 1380 

gaaggcgtct tccggcaggg aaacgtcatg gacctgattc acctgattat gcggctcgac 1440
gcgtcgcaca aagaatcttc gctgaagtcg gcgaagattc tcgacacgaa gaaccttcag 1500 
cgcgaattgg gcgggtacgt cggcgggaag gcgccttacg gcttcgagct tgtttcggag 1560 
acgaaggaga tcacgcgcaa cggccgaatg gtcaatgtcg tcatcaacaa gcttgcgcac 1620 
tcgaccactc cccttaccgg acccttcgag ttcgagcccg acgtaatccg gtggtggtgg 1680 

cgtgagatca agacgcacaa acaccttccc ttcaagccgg gcagtcaagc cgccattcac 1740
ccgggcagca tcacggggct ttgtaagcgc atggacgctg acgccgtgcc gacccggggc 1800 
gagacgattg ggaagaagac cgcttcaagc gcctgggacc cggcaaccgt tatgcgaatc 1860 
cttcgggacc cgcgtattgc gggcttcgcc gctgaggtga tctacaagaa gaagccggac 1920 
ggcacgccga ccacgaagat tgagggttac cgcattcagc gcgacccgat cacgctccgg 1980 
ccggtcgagc ttgattgcgg accgatcatc gagcccgctg agtggtatga gcttcaggcg 2040 
tggttggacg gcagggggcg cggcaagggg ctttcccggg ggcaagccat tctgtccgcc 2100 
atggacaagc tgtactgcga gtgtggcgcc gtcatgactt cgaagcgcgg ggaagaatcg 2160 
atcaaggact cttaccgctg ccgtcgccgg aaggtggtcg acccgtccgc acctgggcag 2220 
cacgaaggca cgtgcaacgt cagcatggcg gcactcgaca agttcgttgc ggaacgcatc 2280 

"caacaaga tcaggcacgc cgaaggcgac gaagagacgt tggcgcttct gtgggaagcc 2340
gcccgacgct tcggcaagct cactgaggcg cctgagaaga gcggcgaacg ggcgaacctt 2400 
gttgcggagc gcgccgacgc cctgaacgcc cttgaagagc tgtacgaaga ccgcgcggca 2460 
ggcgcgtacg acggacccgt tggcaggaag cacttccgga agcaacaggc agcgctgacg 2520 
ctccggcagc aaggggcgga agagcggctt gccgaacttg aagccgccga agccccgaag 2580 
cttccccttg accaatggtt ccccgaagac gccgacgctg acccgaccgg ccctaagtcg 2640 
tggtgggggc gcgcgtcagt agacgacaag cgcgtgttcg tcgggctctt cgtagacaag 2700 
atcgttgtca cgaagtcgac tacgggcagg gggcagggaa cgcccatcga gaagcgcgct 2760 
tcgatcacgt gggcgaagcc gccgaccgac gacgacgaag acgacgccca ggacggcacg 2820 
gaagacgtag cggcgtaggc ggcgcccggg ctcgagatcc aggcgcggat caataaaaga 2880 
tcattatttt caatagatct gtgtgttggt tttttgtgtg ccttggggga gggggaggcc 2940
agaatgaggc gcggccaagg gggaggggga ggccagaatg accttggggg agggggaggc 3000 
cagaatgacc ttgggggagg gggaggccag aatgaggcgc gcccccgggt accgagctcg 3060 
aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt 3120 
aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga ggcccgcacc 3180 
gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat gcggtatttt 3240
ctccttacgc atctgtgcgg tatttcacac cgcatatggt gcactctcag tacaatctgc 3300 
tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga cgcgccctga 3360 
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cgggcttgtc 
tgctcccggc atccgcttac agacaagctg tgaccgtctc cgggagctgc 3420
atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg cctcgtgata 3480
cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc aggtggcact 3540
tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca ttcaaatatg 3600
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 3660
atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt ttgccttcct 3720
gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca 3780
cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag ttttcgcccc 3840
gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc ggtattatcc 3900
cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg 3960
gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt aagagaatta 4020
tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct gacaacgatc 4080
ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt aactcgcctt 4140
gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga caccacgatg 4200
cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct 4260
tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc acttctgcgc 4320
tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga gcgtgggtct 4380
cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt agttatctac 4440
acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga gataggtgcc 4500
tcactgatta agcattggta actgtcagac caagtttact catatatact ttagattgat 4560
ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg 4620
accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc 4680
aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa 4740
ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag 4800
gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta gccgtagtta 4860
ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta 4920
ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag 4980
ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg 5040
gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg 5100
cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 5160
cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc 5220
cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa 5280
aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg 5340
ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt tgagtgagct 5400
gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa 5460
gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta atgcagctgg 5520
cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa tgtgagttag 5580
ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat gttgtgtgga 5640
attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta cgccaagcta 5700
gcccgggcta gcttgcatgc ctgcaggttt 5730
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55 






vector 



<400> 74 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900
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gccatacact tgagtgacat tgacatccac 
cagggcggcc gcccgatatg acgaaagata 
tacgcgagag gaaagggcgg tattatgttt 
aagagcgtta cgtgggtcct ttagctgacg 
gggtcgtagg ggatactccc ctacaagcgg
gaagcggtgg tggaaaagag ggaactgaac 
gccaatacgc gacggacggc aacataaagg 
ggataagcga aaaaactgca aaggactaca 
cgagagacgc acagaaggct taccgactct 

tacatgatga atttgcggat aaaatattga
atatctacat tccaacgttg gaagagataa 
gcgaaaacgt ctacttcatc taccgtatcg 
tactgaaagt gctgaaggaa cccgaaaggg 
cgcttagttg gactagggga tataagggcg 

agagagtaga ggtgacgaag tgggcaatag
tagcgataaa gtacttccgc aaattcgtag 
tagatattat cgattttatt caagggcgta 
tatcgctctt cggcatagcg aaagagcaat 
tctgactcga gggggggccc gtcgacctcg 

tattttcaat agatctgtgt gttggttttt
tgaggcgcgg ccaaggggga gggggaggcc 
atgaccttgg gggaggggga ggccagaatg 
cactggccgt cgttttacaa cgtcgtgact 
gccttgcagc acatccccct ttcgccagct 

gcccttccca acagttgcgc agcctgaatg
ttacgcatct gtgcggtatt tcacaccgca 
atgccgcata gttaagccag ccccgacacc 
cttgtctgct cccggcatcc gcttacagac 
gtcagaggtt ttcaccgtca tcaccgaaac 

tatttttata ggttaatgtc atgataataa
ggggaaatgt gcgcggaacc cctatttgtt 
cgctcatgag acaataaccc tgataaatgc 
gtattcaaca tttccgtgtc gcccttattc 
ttgctcaccc agaaacgctg gtgaaagtaa 

tgggttacat cgaactggat ctcaacagcg
aacgttttcc aatgatgagc acttttaaag 
ttgacgccgg gcaagagcaa ctcggtcgcc 
agtactcacc agtcacagaa aagcatctta 
gtgctgccat aaccatgagt gataacactg 

gaccgaagga gctaaccgct tttttgcaca
gttgggaacc ggagctgaat gaagccatac 
tagcaatggc aacaacgttg cgcaaactat 
ggcaacaatt aatagactgg atggaggcgg 
cccttccggc tggctggttt attgctgata 

gtatcattgc agcactgggg ccagatggta
cggggagtca ggcaactatg gatgaacgaa 
tgattaagca ttggtaactg tcagaccaag 
aacttcattt ttaatttaaa aggatctagg 
aaatccctta acgtgagttt tcgttccact 

gatcttcttg agatcctttt tttctgcgcg
cgctaccagc ggtggtttgt ttgccggatc 
ctggcttcag cagagcgcag ataccaaata 
accacttcaa gaactctgta gcaccgccta 
tggctgctgc cagtggcgat aagtcgtgtc 

cggataaggc gcagcggtcg ggctgaacgg
gaacgaccta caccgaactg agatacctac 
ccgaagggag aaaggcggac aggtatccgg 
cgagggagct tccaggggga aacgcctggt 
tctgacttga gcgtcgattt ttgtgatgct 

ccagcaacgc ggccttttta cggttcctgg
ttcctgcgtt atcccctgat tctgtggata 
ccgctcgccg cagccgaacg accgagcgca 
gcccaatacg caaaccgcct ctccccgcgc 
acaggtttcc cgactggaaa gcgggcagtg 

ctcattaggc accccaggct ttacacttta
tgagcggata acaatttcac acaggaaaca 
gggctagctt gcatgcctgc aggttt 
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tttgcctttc tctccacagg tgtccactcc 960 
agacgcgtta taaatacggg gattatattt 1020 
acaagctaga gtatgaaaac ggtgaggtaa 1080 
tcgttgaatc atatctaaaa atgaaattag 1140 
atccccccgg tttcgagccc gggacaagcg 1200 
gacgtaaaat agcgttggtt gccaatttgc 1260 
cgttctacaa ctatctcatg aacgaaaggg 1320 
tcaatgctat atcaaagccg tataaagaga 1380 
ttgcacgttt cttagcgtca cgcaatatca 1440 
aagcggtaaa ggtgaagaag gcgaacgctg 1500 
aaaggacgtt acaattagca aaagactata 1560 
ctctcgagtc gggcgttagg ctgagcgaaa 1620 
acatttgcgg taacgacgtc tgttattatc 1680 
tcttctatgt attccacata acgcctctga 1740 
cggactttga acgacgtcat aaggacgcta 1800 
cgtctaagat ggctgagcta agcgtaccgt 1860 
aaccgacacg cgttttaacg caacattacg 1920 
ataaaaagta tgcggaatgg ctaaaagggg 1980 
agatccaggc gcggatcaat aaaagatcat 2040 
tgtgtgcctt gggggagggg gaggccagaa 2100 
agaatgacct tgggggaggg ggaggccaga 2160 
aggcgcgccc ccgggtaccg agctcgaatt 2220 
gggaaaaccc tggcgttacc caacttaatc 2280 
ggcgtaatag cgaagaggcc cgcaccgatc 2340 
gcgaatggcg cctgatgcgg tattttctcc 2400 
tatggtgcac tctcagtaca atctgctctg 2460 
cgccaacacc cgctgacgcg ccctgacggg 2520 
aagctgtgac cgtctccggg agctgcatgt 2580 
gcgcgagacg aaagggcctc gtgatacgcc 2640 
tggtttctta gacgtcaggt ggcacttttc 2700 
tatttttcta aatacattca aatatgtatc 2760 
ttcaataata ttgaaaaagg aagagtatga 2820 
ccttttttgc ggcattttgc cttcctgttt 2880 
aagatgctga agatcagttg ggtgcacgag 2940 
gtaagatcct tgagagtttt cgccccgaag 3000 
ttctgctatg tggcgcggta ttatcccgta 3060 
gcatacacta ttctcagaat gacttggttg 3120 
cggatggcat gacagtaaga gaattatgca 3180 
cggccaactt acttctgaca acgatcggag 3240 
acatggggga tcatgtaact cgccttgatc 3300 
caaacgacga gcgtgacacc acgatgcctg 3360 
taactggcga actacttact ctagcttccc 3420 
ataaagttgc aggaccactt ctgcgctcgg 3480 
aatctggagc cggtgagcgt gggtctcgcg 3540 
agccctcccg tatcgtagtt atctacacga 3600 
atagacagat cgctgagata ggtgcctcac 3660 
tttactcata tatactttag attgatttaa 3720 
tgaagatcct ttttgataat ctcatgacca 3780 
gagcgtcaga ccccgtagaa aagatcaaag 3840
taatctgctg cttgcaaaca aaaaaaccac 3900 
aagagctacc aactcttttt ccgaaggtaa 3960 
ctgtccttct agtgtagccg tagttaggcc 4020 
catacctcgc tctgctaatc ctgttaccag 4080
ttaccgggtt ggactcaaga cgatagttac 4140 
ggggttcgtg cacacagccc agcttggagc 4200 
agcgtgagct atgagaaagc gccacgcttc 4260 
taagcggcag ggtcggaaca ggagagcgca 4320 
atctttatag tcctgtcggg tttcgccacc 4380 
cgtcaggggg gcggagccta tggaaaaacg 4440
ccttttgctg gccttttgct cacatgttct 4500 
accgtattac cgcctttgag tgagctgata 4560 
gcgagtcagt gagcgaggaa gcggaagagc 4620
gttggccgat tcattaatgc agctggcacg 4680
agcgcaacgc aattaatgtg agttagctca 4740
tgcttccggc tcgtatgttg tgtggaattg 4800
gctatgacca tgattacgcc aagctagccc 4860

4886 






<210> 75 
<211> 4905 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pCMV-SSV(NNLS) 



<400> 75 

aaacagtccg
tagtaatcaa
cttacggtaa
atgacgtatg
tatttacggt
cctattgacg
tgggactttc
cggttttggc
ctccacccca
aaatgtcgta
gtctatataa
attaatacga
tgagtactcc
cgaggaggat
ctggtcagaa
gccatacact
cagggcggcc
aaatacgggg
tatgaaaacg
tatctaaaaa
ttcgagcccg
gcgttggttg
tatctcatga
tcaaagccgt
ttagcgtcac
gtgaagaagg
caattagcaa
ggcgttaggc
aacgacgtct
ttccacataa
cgacgtcata
gctgagctaa
gttttaacgc
gcggaatggc
cggatcaata
ggggaggggg
gggggagggg
cgggtaccga
ggcgttaccc
gaagaggccc
ctgatgcggt
ctcagtacaa
gctgacgcgc
gtctccggga
aagggcctcg
acgtcaggtg
atacattcaa
tgaaaaagga
gcattttgcc
gatcagttgg
gagagttttc
ggcgcggtat
tctcagaatg
acagtaagag
cttctgacaa



atgtacgggc 
ttacggggtc 
atggcccgcc 
ttcccatagt 
aaactgccca 
tcaatgacgg 
ctacttggca 
agtacatcaa 
ttgacgtcaa 
acaactccgc 
gcagagctct 
ctcactatag 
ctctcaaaag 
ttgatattca 
aagacaatct 
tgagtgacat 
gcaccatgcc 
attatatttt 
gtgaggtaaa 
tgaaattagg 
ggacaagcgg 
ccaatttgcg 
acgaaagggg 
ataaagagac 
gcaatatcat 
cgaacgctga 
aagactatag 
tgagcgaaat 
gttattatcc 
cgcctctgaa 
aggacgctat 
gcgtaccgtt 
aacattacgt 
taaaaggggt 
aaagatcatt 
aggccagaat 
gaggccagaa 
gctcgaattc 
aacttaatcg 
gcaccgatcg 
attttctcct 
tctgctctga 
cctgacgggc 
gctgcatgtg 
tgatacgcct 
gcacttttcg 
atatgtatcc 
agagtatgag 
ttcctgtttt 
gtgcacgagt 
gccccgaaga 
tatcccgtat 
acttggttga 
aattatgcag 
cgatcggagg 



cagatatacg 
attagttcat 
tggctgaccg 
aacgccaata 
cttggcagta 
taaatggccc 
gtacatctac 
tgggcgtgga 
tgggagtttg 
cccattgacg 
ctggctaact 
ggagacccaa 
cgggcatgac 
cctggcccgc 
ttttgttgtc 
tgacatccac 
caagaagaag 
acgcgagagg 
agagcgttac 
ggtcgtaggg 
aagcggtggt 
ccaatacgcg 
gataagcgaa 
gagagacgca 
acatgatgaa 
tatctacatt 
cgaaaacgtc 
actgaaagtg 
gcttagttgg 
gagagtagag 
agcgataaag 
agatattatc 
atcgctcttc 
ctgactcgag 
attttcaata 
gaggcgcggc 
tgaccttggg 
actggccgtc 
ccttgcagca 
cccttcccaa 
tacgcatctg 
tgccgcatag 
ttgtctgctc 
tcagaggttt 
atttttatag 
gggaaatgtg 
gctcatgaga 
tattcaacat 
tgctcaccca 
gggttacatc 
acgttttcca 
tgacgccggg 
gtactcacca
tgctgccata
accgaaggag 



cgttgacatt
agcccatata
cccaacgacc
gggactttcc
catcaagtgt
gcctggcatt
gtattagtca
tagcggtttg
ttttggcacc
caaatgggcg
agagaaccca
gctgactcta
ttctgcgcta
ggtgatgcct
aagcttgagg
tttgcctttc
aggaaggtga
aaagggcggt
gtgggtcctt
gatactcccc
ggaaaagagg
acggacggca
aaaactgcaa
cagaaggctt
tttgcggata
ccaacgttgg
tacttcatct
ctgaaggaac
actaggggat
gtgacgaagt
tacttccgca
gattttattc
ggcatagcga
ggggggcccg
gatctgtgtg
caagggggag
ggagggggag
gttttacaac
catccccctt
cagttgcgca
tgcggtattt
ttaagccagc
ccggcatccg
tcaccgtcat
gttaatgtca
cgcggaaccc
caataaccct
ttccgtgtcg
gaaacgctgg
gaactggatc
atgatgagca
caagagcaac
gtcacagaaa
accatgagtg
ctaaccgctt



gattattgac
tggagttccg
cccgcccatt
attgacgtca
atcatatgcc
atgcccagta
tcgctattac
actcacgggg
aaaatcaacg
gtaggcgtgt
ctgcttactg
gacttaatta
agattgtcag
ttgagggtgg
tgtggcaggc
tctccacagg
cgaaagataa
attatgttta
tagctgacgt
tacaagcgga
gaactgaacg
acataaaggc
aggactacat
accgactctt
aaatattgaa
aagagataaa
accgtatcgc
ccgaaaggga
ataagggcgt
gggcaatagc
aattcgtagc
aagggcgtaa
aagagcaata
tcgacctcga
ttggtttttt
ggggaggcca
gccagaatga
gtcgtgactg
tcgccagctg
gcctgaatgg
cacaccgcat
cccgacaccc
cttacagaca
caccgaaacg
tgataataat
ctatttgttt
gataaatgct
cccttattcc
tgaaagtaaa
tcaacagcgg
cttttaaagt
tcggtcgccg
agcatcttac
ataacactgc
ttttgcacaa



tagttattaa
cgttacataa
gacgtcaata
atgggtggac
aagtacgccc
catgacctta
catggtgatg
atttccaagt
ggactttcca
acggtgggag
gcttatcgaa
agcgttgggg
tttccaaaaa
ccgcgtccat
ttgagatctg
tgtccactcc
gacgcgttat
caagctagag
cgttgaatca
tccccccggt
acgtaaaata
gttctacaac
caatgctata
tgcacgtttc
agcggtaaag
aaggacgtta
tctcgagtcg
catttgcggt
cttctatgta
ggactttgaa
gtctaagatg
accgacacgc
taaaaagtat
gatccaggcg
gtgtgccttg
gaatgacctt
ggcgcgcccc
ggaaaaccct
gcgtaatagc
cgaatggcgc
atggtgcact
gccaacaccc
agctgtgacc
cgcgagacga
ggtttcttag
atttttctaa
tcaataatat
cttttttgcg
agatgctgaa
taagatcctt
tctgctatgt
catacactat
ggatggcatg
ggccaactta
catgggggat



60 
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240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 
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2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 
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catgtaactc gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag 3360 
cgtgacacca cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa 3420 
ctacttactc tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca 3480 
ggaccacttc tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc 3540 
ggtgagcgtg ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt 3600 
atcgtagtta tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc 3660 
gctgagatag gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat 3720 
atactttaga ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt 3780 
tttgataatc tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac 3840
cccgtagaaa agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc 3900 
ttgcaaacaa aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca 3960 
actctttttc cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta 4020 
gtgtagccgt agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct 4080 
ctgctaatcc tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg 4140 
gactcaagac gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc 4200
acacagccca gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta 4260 
tgagaaagcg ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg 4320 
gtcggaacag gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt 4380 
cctgtcgggt ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg 4440
cggagcctat ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg 4500 
ccttttgctc acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc 4560
gcctttgagt gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg 4620
agcgaggaag cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt 4680
cattaatgca gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca 4740 
attaatgtga gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct 4800
cgtatgttgt gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat 4860
gattacgcca agctagcccg ggctagcttg catgcctgca ggttt 4905 

<210> 76

<211> 5290 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: vector 
pCMVXisA 

<400> 76 

agtccgatgt acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt 60
aatcaattac ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta 120 
cggtaaatgg cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga 180 
cgtatgttcc catagtaacg ccaataggga ctttccattg acgtcaatgg gtggactatt 240 
tacggtaaac tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta 300 
ttgacgtcaa tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg 360 
actttcctac ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt 420 
tttggcagta catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc 480 
accccattga cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat 540 
gtcgtaacaa ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct 600 
atataagcag agctctctgg ctaactagag aacccactgc ttactggctt atcgaaatta 660 
atacgactca ctatagggag acccaagctg actctagact taattaagcg ttggggtgag 720 
tactccctct caaaagcggg catgacttct gcgctaagat tgtcagtttc caaaaacgag 780 
gaggatttga tattcacctg gcccgcggtg atgcctttga gggtggccgc gtccatctgg 840 
tcagaaaaga caatcttttt gttgtcaagc ttgaggtgtg gcaggcttga gatctggcca 900 
tacacttgag tgacattgac atccactttg cctttctctc cacaggtgtc cactcccagg 960
gcggccgccc gatatgcaaa atcagggtca agacaaatat caacaagcct ttgcagactt 1020 
agagccactc tcatctaccg acggcagttt tctcggctca agtctgcaag cacagcagca 1080 
aagagaacac atgagaacaa aagtactaca agacctagac aaggtaaatc tgcgtttgaa 1140 
gtctgcaaag acgaaagtct cagttcgaga atctaacgga agtctgcaat tacgagcaac 1200
gttaccaatt aaacctggag ataaggacac caacggtaca ggcagaaagc aatacaatct 1260 
cagcttgaat atccctgcaa acttggatgg actgaagacg gctgaggaag aagcttatga 1320 
attaggtaaa ttaatcgctc ggaaaacctt tgaatggaat gataaatatt taggcaaaga 1380 
agccactaaa aaagattcac aaacaatagg tgatttacta gaaaaatttg cagaagagta 1440 
ttttaaaacc cataaacgca ccactaaaag cgaacatacc tttttttact atttttcccg 1500 
cacccaacga tataccaatt ccaaagattt agcaacggcg gaaaatctca tcaattcaat 1560 
tgagcaaatc gataaagaat gggcgagata taatgccgcc agagccatat cagctttttg 1620 
cataacattc aatatagaaa ttgatttgtc ccagtattcc aaaatgcctg atcgcaattc 1680 



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gcgcaacatc cccacagatg cagaaatact 
agttaccaga ggaaatcaag ttaatgaaga 
gacatatgga atgttagcag tttttggttt 
tattgattgg tggttaagca aagagaatat 
taaaactggt gaaagacaag cattaccctt
aagaaatccg aaatatttag aaatgctggc 
tcatgctgaa ataacagcct taactcagcg 
agattttaaa ccctatgatt tacgtcacgc 
accaatcaaa gcggcggctg ataatttggg 

tcagcgctgg ttctcgctag atatgcggaa
gaatgaattt gaggtgatta gggaggagaa 
gaggatggaa attgagaagt taaagatgga 
gtcgacctcg agatccaggc gcggatcaat 
gttggttttt tgtgtgcctt gggggagggg 

gggggaggcc agaatgacct tgggggaggg
ggccagaatg aggcgcgccc ccgggtaccg 
cgtcgtgact gggaaaaccc tggcgttacc 
ttcgccagct ggcgtaatag cgaagaggcc
agcctgaatg gcgaatggcg cctgatgcgg 

tcacaccgca tatggtgcac tctcagtaca
ccccgacacc cgccaacacc cgctgacgcg 
gcttacagac aagctgtgac cgtctccggg 
tcaccgaaac gcgcgagacg aaagggcctc 
atgataataa tggtttctta gacgtcaggt 

cctatttgtt tatttttcta aatacattca
tgataaatgc ttcaataata ttgaaaaagg 
gcccttattc ccttttttgc ggcattttgc 
gtgaaagtaa aagatgctga agatcagttg 
ctcaacagcg gtaagatcct tgagagtttt 

acttttaaag ttctgctatg tggcgcggta
ctcggtcgcc gcatacacta ttctcagaat 
aagcatctta cggatggcat gacagtaaga 
gataacactg cggccaactt acttctgaca 
tttttgcaca acatggggga tcatgtaact 

gaagccatac caaacgacga gcgtgacacc
cgcaaactat taactggcga actacttact 
atggaggcgg ataaagttgc aggaccactt 
attgctgata aatctggagc cggtgagcgt 
ccagatggta agccctcccg tatcgtagtt 

gatgaacgaa atagacagat cgctgagata
tcagaccaag tttactcata tatactttag 
aggatctagg tgaagatcct ttttgataat 
tcgttccact gagcgtcaga ccccgtagaa 
tttctgcgcg taatctgctg cttgcaaaca 

ttgccggatc aagagctacc aactcttttt
ataccaaata ctgtccttct agtgtagccg 
gcaccgccta catacctcgc tctgctaatc 
aagtcgtgtc ttaccgggtt ggactcaaga 
ggctgaacgg ggggttcgtg cacacagccc 

agatacctac agcgtgagct atgagaaagc
aggtatccgg taagcggcag ggtcggaaca 
aacgcctggt atctttatag tcctgtcggg 
ttgtgatgct cgtcaggggg gcggagccta 
cggttcctgg ccttttgctg gccttttgct 

tctgtggata accgtattac cgcctttgag
accgagcgca gcgagtcagt gagcgaggaa 
ctccccgcgc gttggccgat tcattaatgc 
gcgggcagtg agcgcaacgc aattaatgtg 
ttacacttta tgcttccggc tcgtatgttg 

acaggaaaca gctatgacca tgattacgcc
aggtttaaac 
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atcaggaatt accaaatttg aagactatct 17 40 
tgtaaaagat agctggcaac tttggcgctg 1800 
acgccccagg gaaattttta ttaaccctaa 1860 
agacctcaca tggaaagtag acaaagaatg 1920 
acataaagaa tggattgatg agtttgattt 1980 
aacagcaatt agtaaaaaag ataaaacaaa 2040 
tattagttgg tggtttcgga aagtcgaatt 2100 
ctgggcaatt agagcgcata ttttaggcat 2160 
gcatagtatg caggttcata cacaaaccta 2220 
gttagcgatt aatcaggctt tgactaagag 2280 
tgctaaattg cagatagaaa atgaaaggtt 2340 
aatagcttat aagaatagtt gagcggccgc 2400 
aaaagatcat tattttcaat agatctgtgt 2460
gaggccagaa tgaggcgcgg ccaaggggga 2520 
ggaggccaga atgaccttgg gggaggggga 2580 
agctcgaatt cactggccgt cgttttacaa 2640 
caacttaatc gccttgcagc acatccccct 2700 
cgcaccgatc gcccttccca acagttgcgc 2760
tattttctcc ttacgcatct gtgcggtatt 2820 
atctgctctg atgccgcata gttaagccag 2880 
ccctgacggg cttgtctgct cccggcatcc 2940 
agctgcatgt gtcagaggtt ttcaccgtca 3000 
gtgatacgcc tatttttata ggttaatgtc 3060 
ggcacttttc ggggaaatgt gcgcggaacc 3120 
aatatgtatc cgctcatgag acaataaccc 3180 
aagagtatga gtattcaaca tttccgtgtc 3240 
cttcctgttt ttgctcaccc agaaacgctg 3300 
ggtgcacgag tgggttacat cgaactggat 3360 
cgccccgaag aacgttttcc aatgatgagc 3420 
ttatcccgta ttgacgccgg gcaagagcaa 3480
gacttggttg agtactcacc agtcacagaa 3540 
gaattatgca gtgctgccat aaccatgagt 3600 
acgatcggag gaccgaagga gctaaccgct 3660 
cgccttgatc gttgggaacc ggagctgaat 3720 
acgatgcctg tagcaatggc aacaacgttg 3780 
ctagcttccc ggcaacaatt aatagactgg 3840 
ctgcgctcgg cccttccggc tggctggttt 3900 
gggtctcgcg gtatcattgc agcactgggg 3960 
atctacacga cggggagtca ggcaactatg 4020 
ggtgcctcac tgattaagca ttggtaactg 4080 
attgatttaa aacttcattt ttaatttaaa 4140 
ctcatgacca aaatccctta acgtgagttt 4 200 
aagatcaaag gatcttcttg agatcctttt 4260 
aaaaaaccac cgctaccagc ggtggtttgt 4320 
ccgaaggtaa ctggcttcag cagagcgcag 4380 
tagttaggcc accacttcaa gaactctgta 4440 
ctgttaccag tggctgctgc cagtggcgat 4500 
cgatagttac cggataaggc gcagcggtcg 4560 
agcttggagc gaacgaccta caccgaactg 4 620 
gccacgcttc ccgaagggag aaaggcggac 4 680 
ggagagcgca cgagggagct tccaggggga 4740 
tttcgccacc tctgacttga gcgtcgattt 4800 
tggaaaaacg ccagc.aacgc ggccttttta 4860 
cacatgttct ttcctgcgtt atcccctgat 4 920 
tgagctgata ccgctcgccg cagccgaacg 4 980 
gcggaagagc gcccaatacg caaaccgcct 5040 
agctggcacg acaggtttcc cgactggaaa 5100 
agttagctca ctcattaggc accccaggct 5160 
tgtggaattg tgagcggata acaatttcac 5220 
aagctagccc gggctagctt gcatgcctgc 5280 
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<223> Description of Artificial Sequence: vector 
pCMVXisANNLS ' 

<400> 77 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 
cagggcggcc gcaccatgcc caagaagaag aggaaggtgc aaaatcaggg tcaagacaaa 1020 
tatcaacaag cctttgcaga cttagagcca ctttcatcta ccgacggcag ttttctcggc 1080 
tcaagtctgc aagcacagca gcaaagagaa cacatgagaa caaaagtact acaagaccta 1140 
gacaaggtaa atctgcgttt gaagtctgca aagacgaaag tctcagttcg agaatctaac 1200 
ggaagtctgc aattacgagc aacgttacca attaaacctg gagataagga caccaacggt 1260 
acaggcagaa agcaatacaa tctcagcttg aatatccctg caaacttgga tggactgaag 1320 
acggctgagg aagaagctta tgaattaggt aaattaatcg ctcggaaaac ctttgaatgg 1380 
aatgataaat atttaggcaa agaagccact aaaaaagatt cacaaacaat aggtgattta 1440 
ctagaaaaat ttgcagaaga gtattttaaa acccataaac gcaccactaa aagcgaacat 1500 
accttttttt actatttttc ccgcacccaa cgatatacca attccaaaga tttagcaacg 1560 
gcggaaaatc tcatcaattc aattgagcaa atcgataaag aatgggcgag atataatgcc 1620 
gccagagcca tatcagcttt ttgcataaca ttcaatatag aaattgattt gtcccagtat 1680 
tccaaaatgc ctgatcgcaa ttcgcgcaac atccccacag atgcagaaat actatcagga 1740 
attaccaaat ttgaagacta tctagttacc agaggaaatc aagttaatga agatgtaaaa 1800 
gatagctggc aactttggcg ctggacatat ggaatgttag cagtttttgg tttacgcccc 1860 
agggaaattt ttattaaccc taatattgat tggtggttaa gcaaagagaa tatagacctc 1920 
acatggaaag tagacaaaga atgtaaaact ggtgaaagac aagcattacc cttacataaa 1980 
gaatggattg atgagtttga tttaagaaat ccgaaatatt tagaaatgct ggcaacagca 2040 
attagtaaaa aagataaaac aaatcatgct gaaataacag ccttaactca gcgtattagt 2100 
tggtggtttc ggaaagtcga attagatttt aaaccctatg atttacgtca cgcctgggca 2160 
atcagagcgc atattttagg cataccaatc aaagcggcgg ctgataattt ggggcatagt 2220 
atgcaggttc atacacaaac ctatcagcgc tggttctcgc tagatatgcg gaagttagcg 2280 
attaatcagg ctttgactaa gaggaatgaa tttgaggtga ttagggagga gaatgctaaa 2340 
ttgcagatag aaaatgaaag gttgaggatg gaaattgaga agttaaagat ggaaatagct 2400 
tataagaata gttgagcggc cgcgtcgacc tcgagatcca ggcgcggatc aataaaagat 2460 
cattattttc aatagatctg tgtgttggtt ttttgtgtgc cttgggggag ggggaggcca 2520 
gaatgaggcg cggccaaggg ggagggggag gccagaatga ccttggggga gggggaggcc 2580 
agaatgacct tgggggaggg ggaggccaga atgaggcgcg cccccgggta ccgagctcga 2640 
attcactggc cgtcgtttta caacgtcgtg actgggaaaa ccctggcgtt acccaactta 2700 
atcgccttgc agcacatccc cctttcgcca gctggcgtaa tagcgaagag gcccgcaccg 2760 
atcgcccttc ccaacagttg cgcagcctga atggcgaatg gcgcctgatg cggtattttc 2820 
tccttacgca tctgtgcggt atttcacacc gcatatggtg cactctcagt acaatctgct 2880 
ctgatgccgc atagttaagc cagccccgac acccgccaac acccgctgac gcgccctgac 2940 
gggcttgtct gctcccggca tccgcttaca gacaagctgt gaccgtctcc gggagctgca 3000 
tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag acgaaagggc ctcgtgatac 3060 
gcctattttt ataggttaat gtcatgataa taatggtttc ttagacgtca ggtggcactt 3120 
ttcggggaaa tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt 3180 
atccgctcat gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta 3240 
tgagtattca acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg 3300 
tttttgctca cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac 3360 
gagtgggtta catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg 3420 
aagaacgttt tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc 3480 
gtattgacgc cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg 3540 
ttgagtactc accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat 3600 
gcagtgctgc cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg 3660 
gaggaccgaa ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg 3720
atcgttggga accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc 3780
ctgtagcaat ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt 3840
cccggcaaca attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct 3900
cggcccttcc ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc 3960
gcggtatcat tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca 4020
cgacggggag tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct 4080
cactgattaa gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt 4140
taaaacttca tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga 4200
ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca 4260
aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac 4320
caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg 4380
taactggctt cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag 4440
gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac 4500
cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt 4560
taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg 4620
agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc 4680
ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc 4740
gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc 4800
acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa 4860
acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt 4920
tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg 4980
ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagcggaag 5040
agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 5100
acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 5160
tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 5220
ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagctag 5280
cccgggctag cttgcatgcc tgcaggttt 5309



<210> 78 
<211> 7608 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pPGKnifD 



<400> 78

tcgaggaatt ctaccgggta ggggaggcgc ttttcccaag gcagtctgga gcatgcgctt 60
tagcagcccc gctgggcact tggcgctaca caagtggcct ctggctcgca cacattccac 120
atccaccggt aggcgccaac cggctccgtt ctttggtggc cccttcgcgc caccttctac 180
tcctccccta gtcaggaagt tcccccccgc cccgcagctc gcgtcgtgca ggacgtgaca 240
aatggaagta gcacgtctca ctagtctcgt gcagatggac agcaccgctg agcaatggaa 300
gcgggtaggc ctttggggca gcggccaata gcagctttgc tccttcgctt tctgggctca 360
gaggctggga aggggtgggt ccgggggcgg gctcaggggc gggctcaggg gcggggcggg 420
cgcccgaagg tcctccggag gcccggcatt ctgcacgctt caaaagcgca cgtctgccgc 480
gctgttctcc tcttcctcat ctccgggcct ttcgacctgc agcccggtac agttcgaatg 540
gctcttccct tccgtcaaat gcactcttgg gattactccg aacctagcga tggggtgcaa 600
atgtcagatc agataagttc gaataacttc gtatagcata cattatacga agttataagc 660
ttgcatgcct gcaggtcggc cgccacgacc ggccggccgg tgccgccacc atcccctgac 720
ccacgcccct gacccctcac aaggagacga ccttccatga ccgagtacaa gcccacggtg 780
cgcctcgcca cccgcgacga cgtcccccgg gccgtacgca ccctcgccgc cgcgttcgcc 840
gactaccccg ccacgcgcca caccgtcgac ccggaccgcc acatcgagcg ggtcaccgag 900
ctgcaagaac tcttcctcac gcgcgtcggg ctcgacatcg gcaaggtgtg ggtcgcggac 960
gacggcgccg cggtggcggt ctggaccacg ccggagagcg tcgaagcggg ggcggtgttc 1020
gccgagatcg gcccgcgcat ggccgagttg agcggttccc ggctggccgc gcagcaacag 1080
atggaaggcc tcctggcgcc gcaccggccc aaggagcccg cgtggttcct ggccaccgtc 1140
ggcgtctcgc ccgaccacca gggcaagggt ctgggcagcg ccgtcgtgct ccccggagtg 1200
gaggcggccg agcgcgccgg ggtgcccgcc ttcctggaga cctccgcgcc ccgcaacctc 1260
cccttctacg agcggctcgg cttcaccgtc accgccgacg tcgagtgccc gaaggaccgc 1320
gcgacctggt gcatgacccg caagcccggt gcctgacgcc cgccccacga cccgcagcgc 1380
ccgaccgaaa ggagcgcacg accccatggc tccgaccgaa gccgacccgg gcggccccgc 1440
cgaccccgca cccgcccccg aggcccaccg actctagagg atcataatca gccataccac 1500
atttgtagag gttttacttg ctttaaaaaa cctcccacac ctccccctga acctgaaaca 1560
taaaatgaat gcaattgttg ttgttaactt gtttattgca gcttataatg gttacaaata 1620
aagcaatagc atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg 1680
tttgtccaaa ctcatcaatg tatcttatca tgtctggatc cagctgttga aagctattaa 1740
accacaaaaa ggattactcc ggcccttatc acggttacga cggatttgga tccataactt 1800
cgtatagcat acattatacg aagttatacc gggccaccat ggtcgcgagt agcttggcac 1860
tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc 1920
ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc accgatcgcc 1980
cttcccaaca gttgcgcagc ctgaatggcg aatggcgctt tgcctggttt ccggcaccag 2040
aagcggtgcc ggaaagctgg ctggagtgcg atcttcctga ggccgatact gtcgtcgtcc 2100
cctcaaactg gcagatgcac ggttacgatg cgcccatcta caccaacgta acctatccca 2160
ttacggtcaa tccgccgttt gttcccacgg agaatccgac gggttgttac tcgctcacat 2220
ttaatgttga tgaaagctgg ctacaggaag gccagacgcg aattattttt gatggcgtta 2280
actcggcgtt tcatctgtgg tgcaacgggc gctgggtcgg ttacggccag gacagtcgtt 2340
tgccgtctga atttgacctg agcgcatttt tacgcgccgg agaaaaccgc ctcgcggtga 2400
tggtgctgcg ttggagtgac ggcagttatc tggaagatca ggatatgtgg cggatgagcg 2460
gcattttccg tgacgtctcg ttgctgcata aaccgactac acaaatcagc gatttccatg 2520
ttgccactcg ctttaatgat gatttcagcc gcgctgtact ggaggctgaa gttcagatgt 2580
gcggcgagtt gcgtgactac ctacgggtaa cagtttcttt atggcagggt gaaacgcagg 2640
tcgccagcgg caccgcgcct ttcggcggtg aaattatcga tgagcgtggt ggttatgccg 2700
atcgcgtcac actacgtctg aacgtcgaaa acccgaaact gtggagcgcc gaaatcccga 2760
atctctatcg tgcggtggtt gaactgcaca ccgccgacgg cacgctgatt gaagcagaag 2820
cctgcgatgt cggtttccgc gaggtgcgga ttgaaaatgg tctgctgctg ctgaacggca 2880
agccgttgct gattcgaggc gttaaccgtc acgagcatca tcctctgcat ggtcaggtca 2940
tggatgagca gacgatggtg caggatatcc tgctgatgaa gcagaacaac tttaacgccg 3000
tgcgctgttc gcattatccg aaccatccgc tgtggtacac gctgtgcgac cgctacggcc 3060
tgtatgtggt ggatgaagcc aatattgaaa cccacggcat ggtgccaatg aatcgtctga 3120
ccgatgatcc gcgctggcta ccggcgatga gcgaacgcgt aacgcgaatg gtgcagcgcg 3180
atcgtaatca cccgagtgtg atcatctggt cgctggggaa tgaatcaggc cacggcgcta 3240
atcacgacgc gctgtatcgc tggatcaaat ctgtcgatcc ttcccgcccg gtgcagtatg 3300
aaggcggcgg agccgacacc acggccaccg atattatttg cccgatgtac gcgcgcgtgg 3360
atgaagacca gcccttcccg gctgtgccga aatggtccat caaaaaatgg ctttcgctac 3420
ctggagagac gcgcccgctg atcctttgcg aatacgccca cgcgatgggt aacagtcttg 3480
gcggtttcgc taaatactgg caggcgtttc gtcagtatcc ccgtttacag ggcggcttcg 3540
tctgggactg ggtggatcag tcgctgatta aatatgatga aaacggcaac ccgtggtcgg 3600
cttacggcgg tgattttggc gatacgccga acgatcgcca gttctgtatg aacggtctgg 3660
tctttgccga ccgcacgccg catccagcgc tgacggaagc aaaacaccag cagcagtttt 3720
tccagttccg tttatccggg caaaccatcg aagtgaccag cgaatacctg ttccgtcata 3780
gcgataacga gctcctgcac tggatggtgg cgctggatgg taagccgctg gcaagcggtg 3840
aagtgcctct ggatgtcgct ccacaaggta aacagttgat tgaactgcct gaactaccgc 3900
agccggagag cgccgggcaa ctctggctca cagtacgcgt agtgcaaccg aacgcgaccg 3960
catggtcaga agccgggcac atcagcgcct ggcagcagtg gcgtctggcg gaaaacctca 4020
gtgtgacgct ccccgccgcg tcccacgcca tcccgcatct gaccaccagc gaaatggatt 4080
tttgcatcga gctgggtaat aagcgttggc aatttaaccg ccagtcaggc tttctttcac 4140
agatgtggat tggcgataaa aaacaactgc tgacgccgct gcgcgatcag ttcacccgtg 4200
caccgctgga taacgacatt ggcgtaagtg aagcgacccg cattgaccct aacgcctggg 4260
tcgaacgctg gaaggcggcg ggccattacc aggccgaagc agcgttgttg cagtgcacgg 4320
cagatacact tgctgatgcg gtgctgatta cgaccgctca cgcgtggcag catcagggga 4380
aaaccttatt tatcagccgg aaaacctacc ggattgatgg tagtggtcaa atggcgatta 4440
ccgttgatgt tgaagtggcg agcgatacac cgcatccggc gcggattggc ctgaactgcc 4500
agctggcgca ggtagcagag cgggtaaact ggctcggatt agggccgcaa gaaaactatc 4560
ccgaccgcct tactgccgcc tgttttgacc gctgggatct gccattgtca gacatgtata 4620
ccccgtacgt cttcccgagc gaaaacggtc tgcgctgcgg gacgcgcgaa ttgaattatg 4680
gcccacacca gtggcgcggc gacttccagt tcaacatcag ccgctacagt caacagcaac 4740
tgatggaaac cagccatcgc catctgctgc acgcggaaga aggcacatgg ctgaatatcg 4800
acggtttcca tatggggatt ggtggcgacg actcctggag cccgtcagta tcggcggaat 4860
tccagctgag cgccggtcgc taccattacc agttggtctg gtgtcaaaaa taataataac 4920
cgggcagggg ggatctttgt gaaggaacct tacttctgtg gtgtgacata attggacaaa 4980
ctacctacag agatttaaag ctctaaggta aatataaaat ttttaagtgt ataatgtgtt 5040
aaactactga ttctaattgt ttgtgtattt tagattccaa cctatggaac tgatgaatgg 5100
gagcagtggt ggaatgccag atccagacat gataagatac attgatgagt ttggacaaac 5160
cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg ctattgcttt 5220
atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca ttcattttat 5280
gtttcaggtt cagggggagg tgtgggaggt tttttaaagc aagtaaaacc tctacaaatg 5340
tggtatggct gattatgatc tgcggccgca gggcctcgtg atacgcctat ttttataggt 5400
taatgtcatg ataataatgg tttcttagac gtcaggtggc acttttcggg gaaatgtgcg 5460
cggaacccct atttgtttat ttttctaaat acattcaaat atgtatccgc tcatgagaca 5520
ataaccctga taaatgcttc aataatattg aaaaaggaag agtatgagta ttcaacattt 5580
ccgtgtcgcc cttattccct tttttgcggc attttgcctt cctgtttttg ctcacccaga 5640
aacgctggtg aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg gttacatcga 5700
actggatctc aacagcggta agatccttga gagttttcgc cccgaagaac gttttccaat 5760
gatgagcact tttaaagttc tgctatgtgg cgcggtatta tcccgtattg acgccgggca 5820
agagcaactc ggtcgccgca tacactattc tcagaatgac ttggttgagt actcaccagt 5880
cacagaaaag catcttacgg atggcatgac agtaagagaa ttatgcagtg ctgccataac 5940
catgagtgat aacactgcgg ccaacttact tctgacaacg atcggaggac cgaaggagct 6000
aaccgctttt ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt gggaaccgga 6060
gctgaatgaa gccataccaa acgacgagcg tgacaccacg atgcctgtag caatggcaac 6120
aacgttgcgc aaactattaa ctggcgaact acttactcta gcttcccggc aacaattaat 6180
agactggatg gaggcggata aagttgcagg accacttctg cgctcggccc ttccggctgg 6240
ctggtttatt gctgataaat ctggagccgg tgagcgtggg tctcgcggta tcattgcagc 6300
actggggcca gatggtaagc cctcccgtat cgtagttatc tacacgacgg ggagtcaggc 6360
aactatggat gaacgaaata gacagatcgc tgagataggt gcctcactga ttaagcattg 6420
gtaactgtca gaccaagttt actcatatat actttagatt gatttaaaac ttcattttta 6480
atttaaaagg atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg 6540
tgagttttcg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga 6600
tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt 6660
ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag 6720
agcgcagata ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa 6780
ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag 6840
tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca 6900
gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac 6960
cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa 7020
ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc 7080
agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg 7140
tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc 7200
ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc 7260
ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag 7320
ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa 7380
accgcctctc cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga 7440
ctggaaagcg ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc 7500
ccaggcttta cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca 7560
atttcacaca ggaaacagct atgaccatga ttacgccaag ctggcgcg 7608



<210> 79 
<211> 7523 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pPGKnifD3' 



<400> 79

tcgaggaatt ctaccgggta ggggaggcgc ttttcccaag gcagtctgga gcatgcgctt 60
tagcagcccc gctgggcact tggcgctaca caagtggcct ctggctcgca cacattccac 120
atccaccggt aggcgccaac cggctccgtt ctttggtggc cccttcgcgc caccttctac 180
tcctccccta gtcaggaagt tcccccccgc cccgcagctc gcgtcgtgca ggacgtgaca 240
aatggaagta gcacgtctca ctagtctcgt gcagatggac agcaccgctg agcaatggaa 300
gcgggtaggc ctttggggca gcggccaata gcagctttgc tccttcgctt tctgggctca 360
gaggctggga aggggtgggt ccgggggcgg gctcaggggc gggctcaggg gcggggcggg 420
cgcccgaagg tcctccggag gcccggcatt ctgcacgctt caaaagcgca cgtctgccgc 480
gctgttctcc tcttcctcat ctccgggcct ttcgacctgc agcccggtac agttcgaata 540
acttcgtata gcatacatta tacgaagtta taagcttgca tgcctgcagg tcggccgcca 600
cgaccggccg gccggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga 660
gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc 720
cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg 780
tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg 840
tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga 900
ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg 960
agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc 1020
ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca 1080
agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc 1140
ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca 1200
ccgtcaccgc cgacgtcgag tgcccgaagg accgcgcgac ctggtgcatg acccgcaagc 1260
ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc 1320







atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc 1380
caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta 1440
aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt 1500
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 1560
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 1620
tatcatgtct ggatccagct gttgaaagct attaaaccac aaaaaggatt actccggccc 1680
ttatcacggt tacgacggat ttggatccat aacttcgtat agcatacatt atacgaagtt 1740
ataccgggcc accatggtcg cgagtagctt ggcactggcc gtcgttttac aacgtcgtga 1800
ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag 1860
ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa 1920
tggcgaatgg cgctttgcct ggtttccggc accagaagcg gtgccggaaa gctggctgga 1980
gtgcgatctt cctgaggccg atactgtcgt cgtcccctca aactggcaga tgcacggtta 2040
cgatgcgccc atctacacca acgtaaccta tcccattacg gtcaatccgc cgtttgttcc 2100
cacggagaat ccgacgggtt gttactcgct cacatttaat gttgatgaaa gctggctaca 2160
ggaaggccag acgcgaatta tttttgatgg cgttaactcg gcgtttcatc tgtggtgcaa 2220
cgggcgctgg gtcggttacg gccaggacag tcgtttgccg tctgaatttg acctgagcgc 2280
atttttacgc gccggagaaa accgcctcgc ggtgatggtg ctgcgttgga gtgacggcag 2340
ttatctggaa gatcaggata tgtggcggat gagcggcatt ttccgtgacg tctcgttgct 2400
gcataaaccg actacacaaa tcagcgattt ccatgttgcc actcgcttta atgatgattt 2460
cagccgcgct gtactggagg ctgaagttca gatgtgcggc gagttgcgtg actacctacg 2520
ggtaacagtt tctttatggc agggtgaaac gcaggtcgcc agcggcaccg cgcctttcgg 2580
cggtgaaatt atcgatgagc gtggtggtta tgccgatcgc gtcacactac gtctgaacgt 2640
cgaaaacccg aaactgtgga gcgccgaaat cccgaatctc tatcgtgcgg tggttgaact 2700
gcacaccgcc gacggcacgc tgattgaagc agaagcctgc gatgtcggtt tccgcgaggt 2760
gcggattgaa aatggtctgc tgctgctgaa cggcaagccg ttgctgattc gaggcgttaa 2820
ccgtcacgag catcatcctc tgcatggtca ggtcatggat gagcagacga tggtgcagga 2880
tatcctgctg atgaagcaga acaactttaa cgccgtgcgc tgttcgcatt atccgaacca 2940
tccgctgtgg tacacgctgt gcgaccgcta cggcctgtat gtggtggatg aagccaatat 3000
tgaaacccac ggcatggtgc caatgaatcg tctgaccgat gatccgcgct ggctaccggc 3060
gatgagcgaa cgcgtaacgc gaatggtgca gcgcgatcgt aatcacccga gtgtgatcat 3120
ctggtcgctg gggaatgaat caggccacgg cgctaatcac gacgcgctgt atcgctggat 3180
caaatctgtc gatccttccc gcccggtgca gtatgaaggc ggcggagccg acaccacggc 3240
caccgatatt atttgcccga tgtacgcgcg cgtggatgaa gaccagccct tcccggctgt 3300
gccgaaatgg tccatcaaaa aatggctttc gctacctgga gagacgcgcc cgctgatcct 3360
ttgcgaatac gcccacgcga tgggtaacag tcttggcggt ttcgctaaat actggcaggc 3420
gtttcgtcag tatccccgtt tacagggcgg cttcgtctgg gactgggtgg atcagtcgct 3480
gattaaatat gatgaaaacg gcaacccgtg gtcggcttac ggcggtgatt ttggcgatac 3540
gccgaacgat cgccagttct gtatgaacgg tctggtcttt gccgaccgca cgccgcatcc 3600
agcgctgacg gaagcaaaac accagcagca gtttttccag ttccgtttat ccgggcaaac 3660
catcgaagtg accagcgaat acctgttccg tcatagcgat aacgagctcc tgcactggat 3720
ggtggcgctg gatggtaagc cgctggcaag cggtgaagtg cctctggatg tcgctccaca 3780
aggtaaacag ttgattgaac tgcctgaact accgcagccg gagagcgccg ggcaactctg 3840
gctcacagta cgcgtagtgc aaccgaacgc gaccgcatgg tcagaagccg ggcacatcag 3900
cgcctggcag cagtggcgtc tggcggaaaa cctcagtgtg acgctccccg ccgcgtccca 3960
cgccatcccg catctgacca ccagcgaaat ggatttttgc atcgagctgg gtaataagcg 4020
ttggcaattt aaccgccagt caggctttct ttcacagatg tggattggcg ataaaaaaca 4080
actgctgacg ccgctgcgcg atcagttcac ccgtgcaccg ctggataacg acattggcgt 4140
aagtgaagcg acccgcattg accctaacgc ctgggtcgaa cgctggaagg cggcgggcca 4200
ttaccaggcc gaagcagcgt tgttgcagtg cacggcagat acacttgctg atgcggtgct 4260
gattacgacc gctcacgcgt ggcagcatca ggggaaaacc ttatttatca gccggaaaac 4320
ctaccggatt gatggtagtg gtcaaatggc gattaccgtt gatgttgaag tggcgagcga 4380
tacaccgcat ccggcgcgga ttggcctgaa ctgccagctg gcgcaggtag cagagcgggt 4440
aaactggctc ggattagggc cgcaagaaaa ctatcccgac cgccttactg ccgcctgttt 4500
tgaccgctgg gatctgccat tgtcagacat gtataccccg tacgtcttcc cgagcgaaaa 4560
cggtctgcgc tgcgggacgc gcgaattgaa ttatggccca caccagtggc gcggcgactt 4620
ccagttcaac atcagccgct acagtcaaca gcaactgatg gaaaccagcc atcgccatct 4680
gctgcacgcg gaagaaggca catggctgaa tatcgacggt ttccatatgg ggattggtgg 4740
cgacgactcc tggagcccgt cagtatcggc ggaattccag ctgagcgccg gtcgctacca 4800
ttaccagttg gtctggtgtc aaaaataata ataaccgggc aggggggatc tttgtgaagg 4860
aaccttactt ctgtggtgtg acataattgg acaaactacc tacagagatt taaagctcta 4920
aggtaaatat aaaattttta agtgtataat gtgttaaact actgattcta attgtttgtg 4980
tattttagat tccaacctat ggaactgatg aatgggagca gtggtggaat gccagatcca 5040
gacatgataa gatacattga tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa 5100
tgctttattt gtgaaatttg tgatgctatt gctttatttg taaccattat aagctgcaat 5160
aaacaagtta acaacaacaa ttgcattcat tttatgtttc aggttcaggg ggaggtgtgg 5220
gaggtttttt aaagcaagta aaacctctac aaatgtggta tggctgatta tgatctgcgg 5280
ccgcagggcc tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct 5340
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc 5400
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa 5460
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt 5520
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct 5580
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc 5640
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta 5700
tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg ccgcatacac 5760
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct tacggatggc 5820
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac tgcggccaac 5880
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg 5940
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac 6000
gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact attaactggc 6060
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt 6120
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga 6180
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg taagccctcc 6240
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag 6300
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca 6360
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc 6420
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca 6480
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc 6540
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta 6600
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt 6660
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc 6720
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg 6780
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg 6840
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag 6900
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc 6960
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat 7020
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg 7080
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc 7140
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt 7200
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca 7260
gtgagcgagg aagcggaaga gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg 7320
attcattaat gcagctggca cgacaggttt cccgactgga aagcgggcag tgagcgcaac 7380
gcaattaatg tgagttagct cactcattag gcaccccagg ctttacactt tatgcttccg 7440
gctcgtatgt tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagctatgac 7500
catgattacg ccaagctggc gcg 7523
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vector 



<400> 80 

ggccgcccga tatgacacaa ggggttgtga ccggggtgga cacgtacgcg ggtgcttacg 60
accgtcagtc gcgcgagcgc gagaattcga gcgcagcaag cccagcgaca cagcgtagcg 120
ccaacgaaga caaggcggcc gaccttcagc gcgaagtcga gcgcgacggg ggccggttca 180
ggttcgtcgg gcatttcagc gaagcgccgg gcacgtcggc gttcgggacg gcggagcgcc 240
cggagttcga acgcatcctg aacgaatgcc gcgccgggcg gctcaacatg atcattgtct 300
atgacgtgtc gcgcttctcg cgcctgaagg tcatggacgc gattccgatt gtctcggaat 360
tgctcgccct gggcgtgacg attgtttcca ctcaggaagg cgtcttccgg cagggaaacg 420
tcatggacct gattcacctg attatgcggc tcgacgcgtc gcacaaagaa tcttcgctga 480
agtcggcgaa gattctcgac acgaagaacc ttcagcgcga attgggcggg tacgtcggcg 540
ggaaggcgcc ttacggcttc gagcttgttt cggagacgaa ggagatcacg cgcaacggcc 600
gaatggtcaa tgtcgtcatc aacaagcttg cgcactcgac cactcccctt accggaccct 660
tcgagttcga gcccgacgta atccggtggt ggtggcgtga gatcaagacg cacaaacacc 720
ttcccttcaa gccgggcagt caagccgcca ttcacccggg cagcatcacg gggctttgta 780
agcgcatgga cgctgacgcc gtgccgaccc ggggcgagac gattgggaag aagaccgctt 840
caagcgcctg ggacccggca accgttatgc gaatccttcg ggacccgcgt attgcgggct 900
tcgccgctga ggtgatctac aagaagaagc cggacggcac gccgaccacg aagattgagg 960
gttaccgcat tcagcgcgac ccgatcacgc tccggccggt cgagcttgat tgcggaccga 1020
tcatcgagcc cgctgagtgg tatgagcttc aggcgtggtt ggacggcagg gggcgcggca 1080
aggggctttc ccgggggcaa gccattctgt ccgccatgga caagctgtac tgcgagtgtg 1140
gcgccgtcat gacttcgaag cgcggggaag aatcgatcaa ggactcttac cgctgccgtc 1200
gccggaaggt ggtcgacccg tccgcacctg ggcagcacga aggcacgtgc aacgtcagca 1260
tggcggcact cgacaagttc gttgcggaac gcatcttcaa caagatcagg cacgccgaag 1320
gcgacgaaga gacgttggcg cttctgtggg aagccgcccg acgcttcggc aagctcactg 1380
aggcgcctga gaagagcggc gaacgggcga accttgttgc ggagcgcgcc gacgccctga 1440
acgcccttga agagctgtac gaagaccgcg cggcaggcgc gtacgacgga cccgttggca 1500
ggaagcactt ccggaagcaa caggcagcgc tgacgctccg gcagcaaggg gcggaagagc 1560
ggcttgccga acttgaagcc gccgaagccc cgaagcttcc ccttgaccaa tggttccccg 1620
aagacgccga cgctgacccg accggcccta agtcgtggtg ggggcgcgcg tcagtagacg 1680
acaagcgcgt gttcgtcggg ctcttcgtag acaagatcgt tgtcacgaag tcgactacgg 1740
gcagggggca gggaacgccc atcgagaagc gcgcttcgat cacgtgggcg aagccgccga 1800
ccgacgacga cgaagacgac gcccaggacg gcacggaaga cgtagcggcg taggcggcgc 1860
ccgggctcga gggggggccc ggtacccagc ttttgttccc tttagtgagg gttaatttcg 1920
agcttggcgt aatcatggtc atagctgttt cctgtgtgaa attgttatcc gctcacaatt 1980
ccacacaaca tacgagccgg aagcataaag tgtaaagcct ggggtgccta atgagtgagc 2040
taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc 2100
cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct 2160
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 2220
gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac 2280
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt 2340
ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 2400
cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 2460
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 2520
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 2580
aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 2640
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt 2700
aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 2760
aactacggct acactagaag gacagtattt ggtatctgcg ctctgctgaa gccagttacc 2820
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 2880
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 2940
atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 3000
atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 3060
tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag 3120
gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg 3180
tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga 3240
gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag 3300
cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa 3360
gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc 3420
atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca 3480
aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg 3540
atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat 3600
aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc 3660
aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg 3720
gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg 3780
gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt 3840
gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca 3900
ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata 3960
ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac 4020
atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa 4080
gtgccaccta aattgtaagc gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa 4140
tcagctcatt ttttaaccaa taggccgaaa tcggcaaaat cccttataaa tcaaaagaat 4200
agaccgagat agggttgagt gttgttccag tttggaacaa gagtccacta ttaaagaacg 4260
tggactccaa cgtcaaaggg cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac 4320
catcacccta atcaagtttt ttggggtcga ggtgccgtaa agcactaaat cggaacccta 4380
aagggagccc ccgatttaga gcttgacggg gaaagccggc gaacgtggcg agaaaggaag 4440
ggaagaaagc gaaaggagcg ggcgctaggg cgctggcaag tgtagcggtc acgctgcgcg 4500
taaccaccac acccgccgcg cttaatgcgc cgctacaggg cgcgtcccat tcgccattca 4560
ggctgcgcaa ctgttgggaa gggcgatcgg
cgaaaggggg atgtgctgca aggcgattaa
gacgttgtaa aacgacggcc agtgaattgt
tccaccgcgg tggc
tgcgggcctc ttcgctatta cgccagctgg 4 620 
gttgggtaac gccagggttt tcccagtcac 4 680 
aatacgactc actatagggc gaattggagc 4740- 

4754 
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<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pRK63pbsC31NLS-Re 

<400> 81 

ggccgcacca tgcccaagaa gaagaggaag gtgacacaag gggttgtgac cggggtggac 60 
acgtacgcgg gtgcttacga ccgtcagtcg cgcgagcgcg agaattcgag cgcagcaagc 120 
ccagcgacac agcgtagcgc caacgaagac aaggcggccg accttcagcg cgaagtcgag 180 
cgcgacgggg gccggttcag gttcgtcggg catttcagcg aagcgccggg cacgtcggcg 240 
ttcgggacgg cggagcgccc ggagttcgaa cgcatcctga acgaatgccg cgccgggcgg 300 
ctcaacatga tcattgtcta tgacgtgtcg cgcttctcgc gcctgaaggt catggacgcg 360 
attccgattg tctcggaatt gctcgccctg ggcgtgacga ttgtttccac tcaggaaggc 420 
gtcttccggc agggaaacgt catggacctg attcacctga ttatgcggct cgacgcgtcg 480
cacaaagaat cttcgctgaa gtcggcgaag attctcgaca cgaagaacct tcagcgcgaa 540 
ttgggcgggt acgtcggcgg gaaggcgcct tacggcttcg agcttgtttc ggagacgaag 600 
gagatcacgc gcaacggccg aatggtcaat gtcgtcatca acaagcttgc gcactcgacc 660 
actcccctta ccggaccctt cgagttcgag cccgacgtaa tccggtggtg gtggcgtgag 720 
atcaagacgc acaaacacct tcccttcaag ccgggcagtc aagccgccat tcacccgggc 780 
agcatcacgg ggctttgtaa gcgcatggac gctgacgccg tgccgacccg gggcgagacg 840
attgggaaga agaccgcttc aagcgcctgg gacccggcaa ccgttatgcg aatccttcgg 900 
gacccgcgta ttgcgggctt cgccgctgag gtgatctaca agaagaagcc ggacggcacg 960 
ccgaccacga agattgaggg ttaccgcatt cagcgcgacc cgatcacgct ccggccggtc 1020 
gagcttgatt gcggaccgat catcgagccc gctgagtggt atgagcttca ggcgtggttg 1080 
gacggcaggg ggcgcggcaa ggggctttcc cgggggcaag ccattctgtc cgccatggac 1140 
aagctgtact gcgagtgtgg cgccgtcatg acttcgaagc gcggggaaga atcgatcaag 1200 
gactcttacc gctgccgtcg ccggaaggtg gtcgacccgt ccgcacctgg gcagcacgaa 1260 
ggcacgtgca acgtcagcat ggcggcactc gacaagttcg ttgcggaacg catcttcaac 1320 
aagatcaggc acgccgaagg cgacgaagag acgttggcgc ttctgtggga agccgcccga 1380 
cgcttcggca agctcactga ggcgcctgag aagagcggcg aacgggcgaa ccttgttgcg 1440 
gagcgcgccg acgccctgaa cgcccttgaa gagctgtacg aagaccgcgc ggcaggcgcg 1500 
tacgacggac ccgttggcag gaagcacttc cggaagcaac aggcagcgct gacgctccgg 1560 
cagcaagggg cggaagagcg gcttgccgaa cttgaagccg ccgaagcccc gaagcttccc 1620 
cttgaccaat ggttccccga agacgccgac gctgacccga ccggccctaa gtcgtggtgg 1680 
gggcgcgcgt cagtagacga caagcgcgtg ttcgtcgggc tcttcgtaga caagatcgtt 1740 
gtcacgaagt cgactacggg cagggggcag ggaacgccca tcgagaagcg cgcttcgatc 1800 
acgtgggcga agccgccgac cgacgacgac gaagacgacg cccaggacgg cacggaagac 1860 
gtagcggcgt aggcggcgcc cgggctcgag ggggggcccg gtacccagct tttgttccct 1920 
ttagtgaggg ttaatttcga gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa 1980 
ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt gtaaagcctg 2040 
gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc ccgctttcca 2100 
gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg ggagaggcgg 2160 
tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 2220 
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 2280 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 2340 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 2400
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 2460
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 2520 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 2580 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 2640 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 2700 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 2760 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 2820 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 2880 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 2940 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 3000 
acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 3060 
ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 3120 
ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 3180 
tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag 3240 
tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag caataaacca 3300 
gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc 3360 
tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt 3420 
tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag 3480 
ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt 3540 
tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat 3600 
ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt 3660
gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc 3720
ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat 3780
cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag 3840
ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt 3900
ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg 3960
gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta 4020
ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 4080
gcgcacattt ccccgaaaag tgccacctaa attgtaagcg ttaatatttt gttaaaattc 4140
gcgttaaatt tttgttaaat cagctcattt tttaaccaat aggccgaaat cggcaaaatc 4200
ccttataaat caaaagaata gaccgagata gggttgagtg ttgttccagt ttggaacaag 4260
agtccactat taaagaacgt ggactccaac gtcaaagggc gaaaaaccgt ctatcagggc 4320
gatggcccac tacgtgaacc atcaccctaa tcaagttttt tggggtcgag gtgccgtaaa 4380
gcactaaatc ggaaccctaa agggagcccc cgatttagag cttgacgggg aaagccggcg 4440
aacgtggcga gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt 4500
gtagcggtca cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc 4560
gcgtcccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct 4620
tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg 4680
ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtgaattgta atacgactca 4740
ctatagggcg aattggagct ccaccgcggt ggc 4773



<210> 82 
<211> 7803 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: vector 
pPGKattA1

<400> 82 

tatcatgtct ggatccgcgt taacacctaa gaaggcgaag ttttccttac accttgcaga 60
tataaagtgt ctaacagttt aaaatatccg atgaggcata tttatgttgg acccgtagct 120
cagccaggat agagcactgg cctccggagc cggaggtccc gggttcaaat cccggcgggt 180
ccgtatatta ctttttgatt cagattagat ttgtaaatct ttattacaag gataatttga 240
tcttgtatat tggtaactct ctactctata atttttatga gaaattcaca gtcgtccctt 300
tataccataa atagctaagt ttgtcaaagt tcttattaaa ctctccatgt agagattaaa 360
tcggatccat aacttcgtat agcatacatt atacgaagtt ataccgggcc accatggtcg 420
cgagtagctt ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 480
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 540
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg cgctttgcct 600
ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt cctgaggccg 660
atactgtcgt cgtcccctca aactggcaga tgcacggtta cgatgcgccc atctacacca 720
acgtaaccta tcccattacg gtcaatccgc cgtttgttcc cacggagaat ccgacgggtt 780
gttactcgct cacatttaat gttgatgaaa gctggctaca ggaaggccag acgcgaatta 840
tttttgatgg cgttaactcg gcgtttcatc tgtggtgcaa cgggcgctgg gtcggttacg 900
gccaggacag tcgtttgccg tctgaatttg acctgagcgc atttttacgc gccggagaaa 960
accgcctcgc ggtgatggtg ctgcgttgga gtgacggcag ttatctggaa gatcaggata 1020
tgtggcggat gagcggcatt ttccgtgacg tctcgttgct gcataaaccg actacacaaa 1080
tcagcgattt ccatgttgcc actcgcttta atgatgattt cagccgcgct gtactggagg 1140
ctgaagttca gatgtgcggc gagttgcgtg actacctacg ggtaacagtt tctttatggc 1200
agggtgaaac gcaggtcgcc agcggcaccg cgcctttcgg cggtgaaatt atcgatgagc 1260
gtggtggtta tgccgatcgc gtcacactac gtctgaacgt cgaaaacccg aaactgtgga 1320
gcgccgaaat cccgaatctc tatcgtgcgg tggttgaact gcacaccgcc gacggcacgc 1380
tgattgaagc agaagcctgc gatgtcggtt tccgcgaggt gcggattgaa aatggtctgc 1440
tgctgctgaa cggcaagccg ttgctgattc gaggcgttaa ccgtcacgag catcatcctc 1500
tgcatggtca ggtcatggat gagcagacga tggtgcagga tatcctgctg atgaagcaga 1560
acaactttaa cgccgtgcgc tgttcgcatt atccgaacca tccgctgtgg tacacgctgt 1620
gcgaccgcta cggcctgtat gtggtggatg aagccaatat tgaaacccac ggcatggtgc 1680
caatgaatcg tctgaccgat gatccgcgct ggctaccggc gatgagcgaa cgcgtaacgc 1740
gaatggtgca gcgcgatcgt aatcacccga gtgtgatcat ctggtcgctg gggaatgaat 1800
caggccacgg cgctaatcac gacgcgctgt atcgctggat caaatctgtc gatccttccc 1860
gcccggtgca gtatgaaggc ggcggagccg acaccacggc caccgatatt atttgcccga 1920
tgtacgcgcg cgtggatgaa gaccagccct tcccggctgt gccgaaatgg tccatcaaaa 1980
aatggctttc gctacctgga gagacgcgcc cgctgatcct ttgcgaatac gcccacgcga 2040
tgggtaacag tcttggcggt ttcgctaaat actggcaggc gtttcgtcag tatccccgtt 2100
tacagggcgg cttcgtctgg gactgggtgg atcagtcgct gattaaatat gatgaaaacg 2160
gcaacccgtg gtcggcttac ggcggtgatt ttggcgatac gccgaacgat cgccagttct 2220
gtatgaacgg tctggtcttt gccgaccgca cgccgcatcc agcgctgacg gaagcaaaac 2280
accagcagca gtttttccag ttccgtttat ccgggcaaac catcgaagtg accagcgaat 2340
acctgttccg tcatagcgat aacgagctcc tgcactggat ggtggcgctg gatggtaagc 2400
cgctggcaag cggtgaagtg cctctggatg tcgctccaca aggtaaacag ttgattgaac 2460
tgcctgaact accgcagccg gagagcgccg ggcaactctg gctcacagta cgcgtagtgc 2520
aaccgaacgc gaccgcatgg tcagaagccg ggcacatcag cgcctggcag cagtggcgtc 2580
tggcggaaaa cctcagtgtg acgctccccg ccgcgtccca cgccatcccg catctgacca 2640
ccagcgaaat ggatttttgc atcgagctgg gtaataagcg ttggcaattt aaccgccagt 2700
caggctttct ttcacagatg tggattggcg ataaaaaaca actgctgacg ccgctgcgcg 2760
atcagttcac ccgtgcaccg ctggataacg acattggcgt aagtgaagcg acccgcattg 2820
accctaacgc ctgggtcgaa cgctggaagg cggcgggcca ttaccaggcc gaagcagcgt 2880
tgttgcagtg cacggcagat acacttgctg atgcggtgct gattacgacc gctcacgcgt 2940
ggcagcatca ggggaaaacc ttatttatca gccggaaaac ctaccggatt gatggtagtg 3000
gtcaaatggc gattaccgtt gatgttgaag tggcgagcga tacaccgcat ccggcgcgga 3060
ttggcctgaa ctgccagctg gcgcaggtag cagagcgggt aaactggctc ggattagggc 3120
cgcaagaaaa ctatcccgac cgccttactg ccgcctgttt tgaccgctgg gatctgccat 3180
tgtcagacat gtataccccg tacgtcttcc cgagcgaaaa cggtctgcgc tgcgggacgc 3240
gcgaattgaa ttatggccca caccagtggc gcggcgactt ccagttcaac atcagccgct 3300
acagtcaaca gcaactgatg gaaaccagcc atcgccatct gctgcacgcg gaagaaggca 3360
catggctgaa tatcgacggt ttccatatgg ggattggtgg cgacgactcc tggagcccgt 3420
cagtatcggc ggaattccag ctgagcgccg gtcgctacca ttaccagttg gtctggtgtc 3480
aaaaataata ataaccgggc aggggggatc tttgtgaagg aaccttactt ctgtggtgtg 3540
acataattgg acaaactacc tacagagatt taaagctcta aggtaaatat aaaattttta 3600
agtgtataat gtgttaaact actgattcta attgtttgtg tattttagat tccaacctat 3660
ggaactgatg aatgggagca gtggtggaat gccagatcca gacatgataa gatacattga 3720
tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg 3780
tgatgctatt gctttatttg taaccattat aagctgcaat aaacaagtta acaacaacaa 3840
ttgcattcat tttatgtttc aggttcaggg ggaggtgtgg gaggtttttt aaagcaagta 3900
aaacctctac aaatgtggta tggctgatta tgatctgcgg ccgcagggcc tcgtgatacg 3960
cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt 4020
tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 4080
tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 4140
gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 4200
ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 4260
agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 4320
agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 4380
tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 4440
tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 4500
cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 4560
aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 4620
tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 4680
tgtagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 4740
ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 4800
ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 4860
cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 4920
gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 4980
actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 5040
aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 5100
caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 5160
aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 5220
accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 5280
aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 5340
ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 5400
agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 5460
accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 5520
gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 5580
tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 5640
cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 5700
cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 5760
cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 5820
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 5880
taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 5940
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 6000
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 6060
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 6120
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagctggc 6180
gcgtcgagga attctaccgg gtaggggagg cgcttttccc aaggcagtct ggagcatgcg 6240
ctttagcagc cccgctgggc acttggcgct acacaagtgg cctctggctc gcacacattc 6300
cacatccacc ggtaggcgcc aaccggctcc gttctttggt ggccccttcg cgccaccttc 6360
tactcctccc ctagtcagga agttcccccc cgccccgcag ctcgcgtcgt gcaggacgtg 6420
acaaatggaa gtagcacgtc tcactagtct cgtgcagatg gacagcaccg ctgagcaatg 6480
gaagcgggta ggcctttggg gcagcggcca atagcagctt tgctccttcg ctttctgggc 6540
tcagaggctg ggaaggggtg ggtccggggg cgggctcagg ggcgggctca ggggcggggc 6600
gggcgcccga aggtcctccg gaggcccggc attctgcacg cttcaaaagc gcacgtctgc 6660
cgcgctgttc tcctcttcct catctccggg cctttcgacc tgcagcccgg tacagttcga 6720
ataacttcgt atagcataca ttatacgaag ttataagctt gcatgcctgc aggtcggccg 6780
ccacgaccgg ccggccggtg ccgccaccat cccctgaccc acgcccctga cccctcacaa 6840
ggagacgacc ttccatgacc gagtacaagc ccacggtgcg cctcgccacc cgcgacgacg 6900
tcccccgggc cgtacgcacc ctcgccgccg cgttcgccga ctaccccgcc acgcgccaca 6960
ccgtcgaccc ggaccgccac atcgagcggg tcaccgagct gcaagaactc ttcctcacgc 7020
gcgtcgggct cgacatcggc aaggtgtggg tcgcggacga cggcgccgcg gtggcggtct 7080
ggaccacgcc ggagagcgtc gaagcggggg cggtgttcgc cgagatcggc ccgcgcatgg 7140
ccgagttgag cggttcccgg ctggccgcgc agcaacagat ggaaggcctc ctggcgccgc 7200
accggcccaa ggagcccgcg tggttcctgg ccaccgtcgg cgtctcgccc gaccaccagg 7260
gcaagggtct gggcagcgcc gtcgtgctcc ccggagtgga ggcggccgag cgcgccgggg 7320
tgcccgcctt cctggagacc tccgcgcccc gcaacctccc cttctacgag cggctcggct 7380
tcaccgtcac cgccgacgtc gagtgcccga aggaccgcgc gacctggtgc atgacccgca 7440
agcccggtgc ctgacgcccg ccccacgacc cgcagcgccc gaccgaaagg agcgcacgac 7500
cccatggctc cgaccgaagc cgacccgggc ggccccgccg accccgcacc cgcccccgag 7560
gcccaccgac tctagaggat cataatcagc cataccacat ttgtagaggt tttacttgct 7620
ttaaaaaacc tcccacacct ccccctgaac ctgaaacata aaatgaatgc aattgttgtt 7680
gttaacttgt ttattgcagc ttataatggt tacaaataaa gcaatagcat cacaaatttc 7740
acaaataaag catttttttc actgcattct agttgtggtt tgtccaaact catcaatgta 7800
tct 7803



<210> 83 
<211> 8167 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pPGKattA2 



<400> 83

tatcatgtct ggatccgcgt taacacctaa gaaggcgaag ttttccttac accttgcaga 60
tataaagtgt ctaacagttt aaaatatccg atgaggcata tttatgttgg acccgtagct 120
cagccaggat agagcactgg cctccggagc cggaggtccc gggttcaaat cccggcgggt 180
ccgtatatta ctttttgatt cagattagat ttgtaaatct ttattacaag gataatttga 240
tcttgtatat tggtaactct ctactctata atttttatga gaaattcaca gtcgtccctt 300
tataccataa atagctaagt ttgtcaaagt tcttattaaa ctctccatgt agagattaaa 360
tcggatccat aacttcgtat agcatacatt atacgaagtt ataccgggcc accatggtcg 420
cgagtagctt ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 480
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 540
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg cgctttgcct 600
ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt cctgaggccg 660
atactgtcgt cgtcccctca aactggcaga tgcacggtta cgatgcgccc atctacacca 720
acgtaaccta tcccattacg gtcaatccgc cgtttgttcc cacggagaat ccgacgggtt 780
gttactcgct cacatttaat gttgatgaaa gctggctaca ggaaggccag acgcgaatta 840
tttttgatgg cgttaactcg gcgtttcatc tgtggtgcaa cgggcgctgg gtcggttacg 900
gccaggacag tcgtttgccg tctgaatttg acctgagcgc atttttacgc gccggagaaa 960
accgcctcgc ggtgatggtg ctgcgttgga gtgacggcag ttatctggaa gatcaggata 1020
tgtggcggat gagcggcatt ttccgtgacg tctcgttgct gcataaaccg actacacaaa 1080
tcagcgattt ccatgttgcc actcgcttta atgatgattt cagccgcgct gtactggagg 1140
ctgaagttca gatgtgcggc gagttgcgtg actacctacg ggtaacagtt tctttatggc 1200
agggtgaaac gcaggtcgcc agcggcaccg cgcctttcgg cggtgaaatt atcgatgagc 1260
gtggtggtta tgccgatcgc gtcacactac gtctgaacgt cgaaaacccg aaactgtgga 1320
gcgccgaaat cccgaatctc tatcgtgcgg tggttgaact gcacaccgcc gacggcacgc 1380
tgattgaagc agaagcctgc gatgtcggtt tccgcgaggt gcggattgaa aatggtctgc 1440
tgctgctgaa cggcaagccg ttgctgattc gaggcgttaa ccgtcacgag catcatcctc 1500
tgcatggtca ggtcatggat gagcagacga tggtgcagga tatcctgctg atgaagcaga 1560
acaactttaa cgccgtgcgc tgttcgcatt atccgaacca tccgctgtgg tacacgctgt 1620
gcgaccgcta cggcctgtat gtggtggatg aagccaatat tgaaacccac ggcatggtgc 1680
caatgaatcg tctgaccgat gatccgcgct ggctaccggc gatgagcgaa cgcgtaacgc 1740
gaatggtgca gcgcgatcgt aatcacccga gtgtgatcat ctggtcgctg gggaatgaat 1800
caggccacgg cgctaatcac gacgcgctgt atcgctggat caaatctgtc gatccttccc 1860
gcccggtgca gtatgaaggc ggcggagccg acaccacggc caccgatatt atttgcccga 1920
tgtacgcgcg cgtggatgaa gaccagccct tcccggctgt gccgaaatgg tccatcaaaa 1980
aatggctttc gctacctgga gagacgcgcc cgctgatcct ttgcgaatac gcccacgcga 2040
tgggtaacag tcttggcggt ttcgctaaat actggcaggc gtttcgtcag tatccccgtt 2100
tacagggcgg cttcgtctgg gactgggtgg atcagtcgct gattaaatat gatgaaaacg 2160
gcaacccgtg gtcggcttac ggcggtgatt ttggcgatac gccgaacgat cgccagttct 2220
gtatgaacgg tctggtcttt gccgaccgca cgccgcatcc agcgctgacg gaagcaaaac 2280
accagcagca gtttttccag ttccgtttat ccgggcaaac catcgaagtg accagcgaat 2340
acctgttccg tcatagcgat aacgagctcc tgcactggat ggtggcgctg gatggtaagc 2400
cgctggcaag cggtgaagtg cctctggatg tcgctccaca aggtaaacag ttgattgaac 2460
tgcctgaact accgcagccg gagagcgccg ggcaactctg gctcacagta cgcgtagtgc 2520
aaccgaacgc gaccgcatgg tcagaagccg ggcacatcag cgcctggcag cagtggcgtc 2580
tggcggaaaa cctcagtgtg acgctccccg ccgcgtccca cgccatcccg catctgacca 2640
ccagcgaaat ggatttttgc atcgagctgg gtaataagcg ttggcaattt aaccgccagt 2700
caggctttct ttcacagatg tggattggcg ataaaaaaca actgctgacg ccgctgcgcg 2760
atcagttcac ccgtgcaccg ctggataacg acattggcgt aagtgaagcg acccgcattg 2820
accctaacgc ctgggtcgaa cgctggaagg cggcgggcca ttaccaggcc gaagcagcgt 2880
tgttgcagtg cacggcagat acacttgctg atgcggtgct gattacgacc gctcacgcgt 2940
ggcagcatca ggggaaaacc ttatttatca gccggaaaac ctaccggatt gatggtagtg 3000
gtcaaatggc gattaccgtt gatgttgaag tggcgagcga tacaccgcat ccggcgcgga 3060
ttggcctgaa ctgccagctg gcgcaggtag cagagcgggt aaactggctc ggattagggc 3120
cgcaagaaaa ctatcccgac cgccttactg ccgcctgttt tgaccgctgg gatctgccat 3180
tgtcagacat gtataccccg tacgtcttcc cgagcgaaaa cggtctgcgc tgcgggacgc 3240
gcgaattgaa ttatggccca caccagtggc gcggcgactt ccagttcaac atcagccgct 3300
acagtcaaca gcaactgatg gaaaccagcc atcgccatct gctgcacgcg gaagaaggca 3360
catggctgaa tatcgacggt ttccatatgg ggattggtgg cgacgactcc tggagcccgt 3420
cagtatcggc ggaattccag ctgagcgccg gtcgctacca ttaccagttg gtctggtgtc 3480
aaaaataata ataaccgggc aggggggatc tttgtgaagg aaccttactt ctgtggtgtg 3540
acataattgg acaaactacc tacagagatt taaagctcta aggtaaatat aaaattttta 3600
agtgtataat gtgttaaact actgattcta attgtttgtg tattttagat tccaacctat 3660
ggaactgatg aatgggagca gtggtggaat gccagatcca gacatgataa gatacattga 3720
tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg 3780
tgatgctatt gctttatttg taaccattat aagctgcaat aaacaagtta acaacaacaa 3840
ttgcattcat tttatgtttc aggttcaggg ggaggtgtgg gaggtttttt aaagcaagta 3900
aaacctctac aaatgtggta tggctgatta tgatctgcgg ccgcagggcc tcgtgatacg 3960
cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt 4020
tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 4080
tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 4140
gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 4200
ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 4260
agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 4320
agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 4380
tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 4440
tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 4500
cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 4560
aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 4620
tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 4680
tgtagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 4740
ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 4800
ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 4860
cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 4920
gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 4980
actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 5040
aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 5100
caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 5160
aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 5220
accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 5280
aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 5340
ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 5400
agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 5460
accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 5520
gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 5580
tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 5640
cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 5700
cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 5760
cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 5820
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 5880
taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 5940
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 6000
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 6060
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 6120
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagctggc 6180
gcgtcgagga attctaccgg gtaggggagg cgcttttccc aaggcagtct ggagcatgcg 6240
ctttagcagc cccgctgggc acttggcgct acacaagtgg cctctggctc gcacacattc 6300
cacatccacc ggtaggcgcc aaccggctcc gttctttggt ggccccttcg cgccaccttc 6360
tactcctccc ctagtcagga agttcccccc cgccccgcag ctcgcgtcgt gcaggacgtg 6420
acaaatggaa gtagcacgtc tcactagtct cgtgcagatg gacagcaccg ctgagcaatg 6480
gaagcgggta ggcctttggg gcagcggcca atagcagctt tgctccttcg ctttctgggc 6540
tcagaggctg ggaaggggtg ggtccggggg cgggctcagg ggcgggctca ggggcggggc 6600
gggcgcccga aggtcctccg gaggcccggc attctgcacg cttcaaaagc gcacgtctgc 6660
cgcgctgttc tcctcttcct catctccggg cctttcgacc tgcagcccgg tacagttcga 6720
aggatccgcg ttaacaccta agaaggcgaa gttttcctta caccttgcag atataaagtg 6780
tctaacagtt taaaatatcc gatgaggcat atttatgttg gacccgtagc tcagccagga 6840
tagagcactg gcctccggag ccggaggtcc cgggttcaaa tcccggcggg tccgtatatt 6900
actttttgat tcagattaga tttgtaaatc tttattacaa ggataatttg atcttgtata 6960
ttggtaactc tctactctat aatttttatg agaaattcac agtcgtccct ttataccata 7020
aatagctaag tttgtcaaag ttcttattaa actctccatg tagagattaa atcggatcct 7080
tcgaataact tcgtatagca tacattatac gaagttataa gcttgcatgc ctgcaggtcg 7140
gccgccacga ccggccggcc ggtgccgcca ccatcccctg acccacgccc ctgacccctc 7200
acaaggagac gaccttccat gaccgagtac aagcccacgg tgcgcctcgc cacccgcgac 7260
gacgtccccc gggccgtacg caccctcgcc gccgcgttcg ccgactaccc cgccacgcgc 7320
cacaccgtcg acccggaccg ccacatcgag cgggtcaccg agctgcaaga actcttcctc 7380
acgcgcgtcg ggctcgacat cggcaaggtg tgggtcgcgg acgacggcgc cgcggtggcg 7440
gtctggacca cgccggagag cgtcgaagcg ggggcggtgt tcgccgagat cggcccgcgc 7500
atggccgagt tgagcggttc ccggctggcc gcgcagcaac agatggaagg cctcctggcg 7560
ccgcaccggc ccaaggagcc cgcgtggttc ctggccaccg tcggcgtctc gcccgaccac 7620
cagggcaagg gtctgggcag cgccgtcgtg ctccccggag tggaggcggc cgagcgcgcc 7680
ggggtgcccg ccttcctgga gacctccgcg ccccgcaacc tccccttcta cgagcggctc 7740
ggcttcaccg tcaccgccga cgtcgagtgc ccgaaggacc gcgcgacctg gtgcatgacc 7800
cgcaagcccg gtgcctgacg cccgccccac gacccgcagc gcccgaccga aaggagcgca 7860
cgaccccatg gctccgaccg aagccgaccc gggcggcccc gccgaccccg cacccgcccc 7920
cgaggcccac cgactctaga ggatcataat cagccatacc acatttgtag aggttttact 7980
tgctttaaaa aacctcccac acctccccct gaacctgaaa cataaaatga atgcaattgt 8040
tgttgttaac ttgtttattg cagcttataa tggttacaaa taaagcaata gcatcacaaa 8100
tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca aactcatcaa 8160
tgtatct 8167



<210> 84 
<211> 51 
<212> DNA 
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer XisA1
<400> 84

ataagaatgc ggccgcccga tatgcaaaat cagggtcaag acaaatatca a 51



<210> 85 
<211> 76 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer XisA2 
<400> 85 
ataagaatgc ggccgcacca tgcccaagaa gaagaggaag gtgcaaaatc agggtcaaga 60
caaatatcaa caagcc 76

<210> 86

<211> 44

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer XisA3 

<400> 86 

ataagaatgc ggccgctcaa ctattcttat aagctatttc catc 44 

<210> 87 

<211> 82 

<212> DNA 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer nifD1
<400> 87

cgatggctct tcccttccgt caaatgcact cttgggatta ctccgaacct agcgatgggg 60 
tgcaaatgtc agatcagata ag 82 

<210> 88
<211> 82
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer nifD2
<400> 88

cgcttatctg atctgacatt tgcaccccat cgctaggttc ggagtaatcc caagagtgca 60
tttgacggaa gggaagagcc at 82

<210> 89
<211> 74
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer nifD3
<400> 89

gatcagctgt tgaaagctat taaaccacaa aaaggattac tccggccctt atcacggtta 60
cgacggattt gcta 74

<210> 90 
<211> 74 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer nifD4 
<400> 90 

gatctagcaa atccgtcgta accgtgataa gggccggagt aatccttttt gtggtttaat 60
agctttcaac agct 74
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<210> 91 
<211> 41 
<212> DNA 
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer SSV1-1
<400> 91

ataagaatgc ggccgcccga tatgacgaaa gataagacgc g 41 

<210> 92 
<211> 70
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer SSV1-2
<400> 92

ataagaatgc ggccgcacca tgcccaagaa gaagaggaag gtgacgaaag ataagacgcg 60
ttataaatac 70



<210> 93 
<211> 47 
<212> DNA 
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: primer SSV2
<400> 93

tgtcccgggc tcgaaaccgg ggggatccgc ttgtagggga gtatccc 47 

<210> 94 
<211> 47
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV3 
<400> 94 

gagcccggga caagcggaag cggtggtgga aaagagggaa ctgaacg 47 

<210> 95 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV4 
<400> 95 

atcgctcgag tcagacccct tttagccatt ccg 33 



<210> 96 
<211> 40 
<212> DNA

<213> Artificial Sequence 
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<220> 

<223> Description of Artificial Sequence: primer SSV5 
<400> 96 

atcgttcgaa ggatccgcgt taacacctaa gaaggcgaag 40

<210> 97 
<211> 38 
10 <212> DNA 

<213> Artificial Sequence 






<220> 

<223> Description of Artificial Sequence: primer SSV6 
<400> 97 

atcgttcgaa ggatccgatt taatctctac atggagag 38 



<210> 98
<211> 64 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: primer C31-2 
<400> 98 

ataagaatgc ggccgcacca tgcccaagaa gaagaggaag gtgacacaag gggttgtgac 60
cggg 64



<210> 99 
<211> 4831 
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: vector pRK41 



<400> 99 

ggccgcacca tgcccaagaa gaagaggaag gtgacacaag gggttgtgac cggggtggac 60 

acgtacgcgg gtgcttacga ccgtcagtcg cgcgagcgcg agaattcgag cgcagcaagc 120 

ccagcgacac agcgtagcgc caacgaagac aaggcggccg accttcagcg cgaagtcgag 180 

cgcgacgggg gccggttcag gttcgtcggg catttcagcg aagcgccggg cacgtcggcg 240

ttcgggacgg cggagcgccc ggagttcgaa cgcatcctga acgaatgccg cgccgggcgg 300 

ctcaacatga tcattgtcta tgacgtgtcg cgcttctcgc gcctgaaggt catggacgcg 360 

attccgattg tctcggaatt gctcgccctg ggcgtgacga ttgtttccac tcaggaaggc 420 

gtcttccggc agggaaacgt catggacctg attcacctga ttatgcggct cgacgcgtcg 480

cacaaagaat cttcgctgaa gtcggcgaag attctcgaca cgaagaacct tcagcgcgaa 540 

ttgggcgggt acgtcggcgg gaaggcgcct tacggcttcg agcttgtttc ggagacgaag 600 

gagatcacgc gcaacggccg aatggtcaat gtcgtcatca acaagcttgc gcactcgacc 660 

actcccctta ccggaccctt cgagttcgag cccgacgtaa tccggtggtg gtggcgtgag 720 

atcaagacgc acaaacacct tcccttcaag ccgggcagtc aagccgccat tcacccgggc 780 

agcatcacgg ggctttgtaa gcgcatggac gctgacgccg tgccgacccg gggcgagacg 840

attgggaaga agaccgcttc aagcgcctgg gacccggcaa ccgttatgcg aatccttcgg 900 

gacccgcgta ttgcgggctt cgccgctgag gtgatctaca agaagaagcc ggacggcacg 960 

ccgaccacga agattgaggg ttaccgcatt cagcgcgacc cgatcacgct ccggccggtc 1020 

gagcttgatt gcggaccgat catcgagccc gctgagtggt atgagcttca ggcgtggttg 1080 

gacggcaggg ggcgcggcaa ggggctttcc cgggggcaag ccattctgtc cgccatggac 1140 

aagctgtact gcgagtgtgg cgccgtcatg acttcgaagc gcggggaaga atcgatcaag 1200 

gactcttacc gctgccgtcg ccggaaggtg gtcgacccgt ccgcacctgg gcagcacgaa 1260 

ggcacgtgca acgtcagcat ggcggcactc gacaagttcg ttgcggaacg catcttcaac 1320 

aagatcaggc acgccgaagg cgacgaagag acgttggcgc ttctgtggga agccgcccga 1380

cgcttcggca agctcactga ggcgcctgag aagagcggcg aacgggcgaa ccttgttgcg 1440 

gagcgcgccg acgccctgaa cgcccttgaa gagctgtacg aagaccgcgc ggcaggcgcg 1500 

tacgacggac ccgttggcag gaagcacttc cggaagcaac aggcagcgct gacgctccgg 1560 
cagcaagggg cggaagagcg gcttgccgaa cttgaagccg ccgaagcccc gaagcttccc 1620
cttgaccaat ggttccccga agacgccgac gctgacccga ccggccctaa gtcgtggtgg 1680
gggcgcgcgt cagtagacga caagcgcgtg ttcgtcgggc tcttcgtaga caagatcgtt 1740
gtcacgaagt cgactacggg cagggggcag ggaacgccca tcgagaagcg cgcttcgatc 1800
acgtgggcga agccgccgac cgacgacgac gaagacgacg cccaggacgg cacggaagac 1860
gtagcggcgt agcggccgct ctagaactag tggatccccc gggctgcagg aattcgatat 1920
caagcttatc gataccgtcg acctcgaggg ggggcccggt acccagcttt tgttcccttt 1980
agtgagggtt aatttcgagc ttggcgtaat catggtcata gctgtttcct gtgtgaaatt 2040
gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt aaagcctggg 2100
gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt 2160
cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg agaggcggtt 2220
tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc 2280
tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg 2340
ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg 2400
ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac 2460
gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg 2520
gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct 2580
ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg 2640
tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct 2700
gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac 2760
tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt 2820
tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt atctgcgctc 2880
tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca 2940
ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat 3000
ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac 3060
gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt 3120
aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc 3180
aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg 3240
cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg 3300
ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc 3360
cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta 3420
ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg 3480
ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct 3540
ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta 3600
gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg 3660
ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga 3720
ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt 3780
gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca 3840
ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt 3900
cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt 3960
ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga 4020
aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat cagggttatt 4080
gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc 4140
gcacatttcc ccgaaaagtg ccacctaaat tgtaagcgtt aatattttgt taaaattcgc 4200
gttaaatttt tgttaaatca gctcattttt taaccaatag gccgaaatcg gcaaaatccc 4260
ttataaatca aaagaataga ccgagatagg gttgagtgtt gttccagttt ggaacaagag 4320
tccactatta aagaacgtgg actccaacgt caaagggcga aaaaccgtct atcagggcga 4380
tggcccacta cgtgaaccat caccctaatc aagttttttg gggtcgaggt gccgtaaagc 4440
actaaatcgg aaccctaaag ggagcccccg atttagagct tgacggggaa agccggcgaa 4500
cgtggcgaga aaggaaggga agaaagcgaa aggagcgggc gctagggcgc tggcaagtgt 4560
agcggtcacg ctgcgcgtaa ccaccacacc cgccgcgctt aatgcgccgc tacagggcgc 4620
gtcccattcg ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc gggcctcttc 4680
gctattacgc cagctggcga aagggggatg tgctgcaagg cgattaagtt gggtaacgcc 4740
agggttttcc cagtcacgac gttgtaaaac gacggccagt gaattgtaat acgactcact 4800
atagggcgaa ttggagctcc accgcggtgg 4831






<210> 100 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer C31-screen 1



<400> 100 
<210> 101
<211> 21
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer C31-screen 2



<400> 101 

gcagcggtaa gagtccttga t 21 



<210> 102 
<211> 20 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer beta-Gal 3

<400> 102 

atcctctgca tggtcaggtc 20 



<210> 103
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: primer beta-Gal A

<400> 103

cgtggcctga ttcattcc 18

<210> 104 
<211> 5878 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: vector pCAGGS-Cre-pA

<400> 104 

cgccgcgtgc ggcccgcgct gcccggcggc tgtgagcgct gcgggcgcgg cgcggggctt 60 
tgtgcgctcc gcgtgtgcgc gaggggagcg cggccggggg cggtgccccg cggtgcgggg 120 

gggctgcgag gggaacaaag gctgcgtgcg gggtgtgtgc gtgggggggt gagcaggggg 180
tgtgggcgcg gcggtcgggc tgtaaccccc ccctgcaccc ccctccccga gttgctgagc 240 
acggcccggc ttcgggtgcg gggctccgtg cggggcgtgg cgcggggctc gccgtgccgg 300 
gcggggggtg gcggcaggtg ggggtgccgg gcggggcggg gccgcctcgg gccggggagg 360 
gctcggggga ggggcgcggc ggccccggag cgccggcggc tgtcgaggcg cggcgagccg 420 

cagccattgc cttttatggt aatcgtgcga gagggcgcag ggacttcctt tgtcccaaat 480
ctggcggagc cgaaatctgg gaggcgccgc cgcaccccct ctagcgggcg cgggcgaagc 540 
ggtgcggcgc cggcaggaag gaaatgggcg gggagggcct tcgtgcgtcg ccgcgccgcc 600 
gtccccttct ccatctccag cctcggggct gccgcagggg gacggctgcc ttcggggggg 660 
acggggcagg gcggggttcg gcttctggcg tgtgaccggc ggctctagta agcgttgggg 720 

tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960
cagggcggcc tcgaccatgc ccaagaagaa gaggaaggtg tccaatttac tgaccgtaca 1020
ccaaaatttg cctgcattac cggtcgatgc aacgagtgat gaggttcgca agaacctgat 1080
ggacatgttc agggatcgcc aggcgttttc tgagcatacc tggaaaatgc ttctgtccgt 1140
ttgccggtcg tgggcggcat ggtgcaagtt gaataaccgg aaatggtttc ccgcagaacc 1200
tgaagatgtt cgcgattatc ttctatatct tcaggcgcgc ggtctggcag taaaaactat 1260
ccagcaacat ttgggccagc taaacatgct tcatcgtcgg tccgggctgc cacgaccaag 1320
tgacagcaat gctgtttcac tggttatgcg gcggatccga aaagaaaacg ttgatgccgg 1380
tgaacgtgca aaacaggctc tagcgttcga acgcactgat ttcgaccagg ttcgttcact 1440
catggaaaat agcgatcgct gccaggatat acgtaatctg gcatttctgg ggattgctta 1500
taacaccctg ttacgtatag ccgaaattgc caggatcagg gttaaagata tctcacgtac 1560
tgacggtggg agaatgttaa tccatattgg cagaacgaaa acgctggtta gcaccgcagg 1620
tgtagagaag gcacttagcc tgggggtaac taaactggtc gagcgatgga tttccgtctc 1680
tggtgtagct gatgatccga ataactacct gttttgccgg gtcagaaaaa atggtgttgc 1740
cgcgccatct gccaccagcc agctatcaac tcgcgccctg gaagggattt ttgaagcaac 1800
tcatcgattg atttacggcg ctaaggatga ctctggtcag agatacctgg cctggtctgg 1860
acacagtgcc cgtgtcggag ccgcgcgaga tatggcccgc gctggagttt caataccgga 1920
gatcatgcaa gctggtggct ggaccaatgt aaatattgtc atgaactata tccgtaacct 1980
ggatagtgaa acaggggcaa tggtgcgcct gctggaagat ggcgattagc cattaacgcg 2040
taaatgattg cagatccact agttctaggg ccgcgtcgac ctcgagatcc aggcgcggat 2100
caataaaaga tcattatttt caatagatct gtgtgttggt tttttgtgtg ccttggggga 2160
gggggaggcc agaatgaggc gcggccaagg gggaggggga ggccagaatg accttggggg 2220
agggggaggc cagaatgacc ttgggggagg gggaggccag aatgaggcgc gcccccgggt 2280
accgagctcg aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 2340
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 2400
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat 2460
gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatatggt gcactctcag 2520
tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga 2580
cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg tgaccgtctc 2640
cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg 2700
cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc 2760
aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca 2820
ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa 2880
aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 2940
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca 3000
gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag 3060
ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc 3120
ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca 3180
gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 3240
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct 3300
gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt 3360
aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga 3420
caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact 3480
tactctagct tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 3540
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga 3600
gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt 3660
agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga 3720
gataggtgcc tcactgatta agcattggta actgtcagac caagtttact catatatact 3780
ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 3840
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 3900
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 3960
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 4020
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta 4080
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 4140
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 4200
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 4260
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 4320
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 4380
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 4440
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 4500
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 4560
tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt 4620
tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga 4680
ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 4740
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa 4800
tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 4860
gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta 4920
cgccaagcta gcccgggcta gcttgcatgc ctgcaggttt tcgacattga ttattgacta 4980
gttattaata gtaatcaatt acggggtcat tagttcatag cccatatatg gagttccgcg 5040
ttacataact tacggtaaat ggcccgcctg gctgaccgcc caacgacccc cgcccattga 5100
cgtcaataat gacgtatgtt cccatagtaa cgccaatagg gactttccat tgacgtcaat 5160
gggtggacta tttacggtaa actgcccact tggcagtaca tcaagtgtat catatgccaa 5220
gtacgccccc tattgacgtc aatgacggta aatggcccgc ctggcattat gcccagtaca 5280
tgaccttatg ggactttcct acttggcagt acatctacgt attagtcatc gctattacca 5340
tgggtcgagg tgagccccac gttctgcttc actctcccca tctccccccc ctccccaccc 5400
ccaattttgt atttatttat tttttaatta ttttgtgcag cgatgggggc gggggggggg 5460
ggggcgcgcg ccaggcgggg cggggcgggg cgaggggcgg ggcggggcga ggcggagagg 5520
tgcggcggca gccaatcaga gcggcgcgct ccgaaagttt ccttttatgg cgaggcggcg 5580
gcggcggcgg ccctataaaa agcgaagcgc gcggcgggcg ggagtcgctg cgttgccttc 5640
gccccgtgcc ccgctccgcg ccgcctcgcg ccgcccgccc cggctctgac tgaccgcgtt 5700
actcccacag gtgagcgggc gggacggccc ttctcctccg ggctgtaatt agcgcttggt 5760
ttaatgacgg ctcgtttctt ttctgtggct gcgtgaaagc cttaaagggc tccgggaggg 5820
ccctttgtgc gggggggagc ggctcggggg gtgcgtgcgt gtgtgtgtgc gtggggag 5878



<210> 105 
<211> 6641
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector pCAGGSC31CNLS-pA



<400> 105 

attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180
tccattgacg tcaatgggtg gactatttac ggtaaactgc ccacttggca gtacatcaag 240
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360
tcatcgctat taccatgggt cgaggtgagc cccacgttct gcttcactct ccccatctcc 420
cccccctccc cacccccaat tttgtattta tttatttttt aattattttg tgcagcgatg 480
ggggcggggg gggggggggc gcgcgccagg cggggcgggg cggggcgagg ggcggggcgg 540
ggcgaggcgg agaggtgcgg cggcagccaa tcagagcggc gcgctccgaa agtttccttt 600
tatggcgagg cggcggcggc ggcggcccta taaaaagcga agcgcgcggc gggcgggagt 660
cgctgcgttg ccttcgcccc gtgccccgct ccgcgccgcc tcgcgccgcc cgccccggct 720
ctgactgacc gcgttactcc cacaggtgag cgggcgggac ggcccttctc ctccgggctg 780
taattagcgc ttggtttaat gacggctcgt ttcttttctg tggctgcgtg aaagccttaa 840
agggctccgg gagggccctt tgtgcggggg ggagcggctc ggggggtgcg tgcgtgtgtg 900
tgtgcgtggg gagcgccgcg tgcggcccgc gctgcccggc ggctgtgagc gctgcgggcg 960
cggcgcgggg ctttgtgcgc tccgcgtgtg cgcgagggga gcgcggccgg gggcggtgcc 1020
ccgcggtgcg ggggggctgc gaggggaaca aaggctgcgt gcggggtgtg tgcgtggggg 1080
ggtgagcagg gggtgtgggc gcggcggtcg ggctgtaacc cccccctgca cccccctccc 1140
cgagttgctg agcacggccc ggcttcgggt gcggggctcc gtgcggggcg tggcgcgggg 1200
ctcgccgtgc cgggcggggg gtggcggcag gtgggggtgc cgggcggggc ggggccgcct 1260
cgggccgggg agggctcggg ggaggggcgc ggcggccccg gagcgccggc ggctgtcgag 1320
gcgcggcgag ccgcagccat tgccttttat ggtaatcgtg cgagagggcg cagggacttc 1380
ctttgtccca aatctggcgg agccgaaatc tgggaggcgc cgccgcaccc cctctagcgg 1440
gcgcgggcga agcggtgcgg cgccggcagg aaggaaatgg gcggggaggg ccttcgtgcg 1500
tcgccgcgcc gccgtcccct tctccatctc cagcctcggg gctgccgcag ggggacggct 1560
gccttcgggg gggacggggc agggcggggt tcggcttctg gcgtgtgacc ggcggctcta 1620
gtaagcgttg gggtgagtac tccctctcaa aagcgggcat gacttctgcg ctaagattgt 1680
cagtttccaa aaacgaggag gatttgatat tcacctggcc cgcggtgatg cctttgaggg 1740
tggccgcgtc catctggtca gaaaagacaa tctttttgtt gtcaagcttg aggtgtggca 1800
ggcttgagat ctggccatac acttgagtga cattgacatc cactttgcct ttctctccac 1860
aggtgtccac tcccagggcg gccgcccgat atgacacaag gggttgtgac cggggtggac 1920
acgtacgcgg gtgcttacga ccgtcagtcg cgcgagcgcg agaattcgag cgcagcaagc 1980
ccagcgacac agcgtagcgc caacgaagac aaggcggccg accttcagcg cgaagtcgag 2040
cgcgacgggg gccggttcag gttcgtcggg catttcagcg aagcgccggg cacgtcggcg 2100
ttcgggacgg cggagcgccc ggagttcgaa cgcatcctga acgaatgccg cgccgggcgg 2160
ctcaacatga tcattgtcta tgacgtgtcg cgcttctcgc gcctgaaggt catggacgcg 2220
attccgattg tctcggaatt gctcgccctg ggcgtgacga ttgtttccac tcaggaaggc 2280
gtcttccggc agggaaacgt catggacctg attcacctga ttatgcggct cgacgcgtcg 2340
cacaaagaat cttcgctgaa gtcggcgaag attctcgaca cgaagaacct tcagcgcgaa 2400
ttgggcgggt acgtcggcgg gaaggcgcct tacggcttcg agcttgtttc ggagacgaag 2460
gagatcacgc gcaacggccg aatggtcaat gtcgtcatca acaagcttgc gcactcgacc 2520
actcccctta ccggaccctt cgagttcgag cccgacgtaa tccggtggtg gtggcgtgag 2580
atcaagacgc acaaacacct tcccttcaag ccgggcagtc aagccgccat tcacccgggc 2640
agcatcacgg ggctttgtaa gcgcatggac gctgacgccg tgccgacccg gggcgagacg 2700
attgggaaga agaccgcttc aagcgcctgg gacccggcaa ccgttatgcg aatccttcgg 2760
gacccgcgta ttgcgggctt cgccgctgag gtgatctaca agaagaagcc ggacggcacg 2820
ccgaccacga agattgaggg ttaccgcatt cagcgcgacc cgatcacgct ccggccggtc 2880
gagcttgatt gcggaccgat catcgagccc gctgagtggt atgagcttca ggcgtggttg 2940
gacggcaggg ggcgcggcaa ggggctttcc cgggggcaag ccattctgtc cgccatggac 3000
aagctgtact gcgagtgtgg cgccgtcatg acttcgaagc gcggggaaga atcgatcaag 3060
gactcttacc gctgccgtcg ccggaaggtg gtcgacccgt ccgcacctgg gcagcacgaa 3120
ggcacgtgca acgtcagcat ggcggcactc gacaagttcg ttgcggaacg catcttcaac 3180
aagatcaggc acgccgaagg cgacgaagag acgttggcgc ttctgtggga agccgcccga 3240
cgcttcggca agctcactga ggcgcctgag aagagcggcg aacgggcgaa ccttgttgcg 3300
gagcgcgccg acgccctgaa cgcccttgaa gagctgtacg aagaccgcgc ggcaggcgcg 3360
tacgacggac ccgttggcag gaagcacttc cggaagcaac aggcagcgct gacgctccgg 3420
cagcaagggg cggaagagcg gcttgccgaa cttgaagccg ccgaagcccc gaagcttccc 3480
cttgaccaat ggttccccga agacgccgac gctgacccga ccggccctaa gtcgtggtgg 3540
gggcgcgcgt cagtagacga caagcgcgtg ttcgtcgggc tcttcgtaga caagatcgtt 3600
gtcacgaagt cgactacggg cagggggcag ggaacgccca tcgagaagcg cgcttcgatc 3660
acgtgggcga agccgccgac cgacgacgac gaagacgacg cccaggacgg cacggaagac 3720
gtagcggcgc ctaagaagaa gaggaaggtt tagactctcg agatccaggc gcggatcaat 3780
aaaagatcat tattttcaat agatctgtgt gttggttttt tgtgtgcctt gggggagggg 3840
gaggccagaa tgaggcgcgg ccaaggggga gggggaggcc agaatgacct tgggggaggg 3900
ggaggccaga atgaccttgg gggaggggga ggccagaatg aggcgcgccc ccgggtaccg 3960
agctcgaatt cactggccgt cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc 4020
caacttaatc gccttgcagc acatccccct ttcgccagct ggcgtaatag cgaagaggcc 4080
cgcaccgatc gcccttccca acagttgcgc agcctgaatg gcgaatggcg cctgatgcgg 4140
tattttctcc ttacgcatct gtgcggtatt tcacaccgca tatggtgcac tctcagtaca 4200
atctgctctg atgccgcata gttaagccag ccccgacacc cgccaacacc cgctgacgcg 4260
ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac cgtctccggg 4320
agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac gcgcgagacg aaagggcctc 4380
gtgatacgcc tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt 4440
ggcacttttc ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca 4500
aatatgtatc cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg 4560
aagagtatga gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc 4620
cttcctgttt ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg 4680
ggtgcacgag tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt 4740
cgccccgaag aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta 4800
ttatcccgta ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat 4860
gacttggttg agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga 4920
gaattatgca gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca 4980
acgatcggag gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact 5040
cgccttgatc gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc 5100
acgatgcctg tagcaatggc aacaacgttg cgcaaactat taactggcga actacttact 5160
ctagcttccc ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt 5220
ctgcgctcgg cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt 5280
gggtctcgcg gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt 5340
atctacacga cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata 5400
ggtgcctcac tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag 5460
attgatttaa aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat 5520
ctcatgacca aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa 5580
aagatcaaag gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca 5640
aaaaaaccac cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt 5700
ccgaaggtaa ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg 5760
tagttaggcc accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc 5820
ctgttaccag tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga 5880
cgatagttac cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc 5940
agcttggagc gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc 6000
gccacgcttc ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca 6060
ggagagcgca cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg 6120
tttcgccacc tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta 6180
tggaaaaacg ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct 6240
cacatgttct ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag 6300
tgagctgata ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa 6360
gcggaagagc gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc 6420

agctggcacg acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgtg 6480 

agttagctca ctcattaggc accccaggct ttacacttta tgcttccggc tcgtatgttg 6540 

tgtggaattg tgagcggata acaatttcac acaggaaaca gctatgacca tgattacgcc 6600 

aagctagccc gggctagctt gcatgcctgc aggttttcga c 6641

<210> 106 
<211> 11784 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: modified ROSA26 locus

<400> 106 

ggcaggccct ccgagcgtgg tggagccgtt ctgtgagaca gccgggtacg agtcgtgacg 60 
ctggaagggg caagcgggtg gtgggcagga atgcggtccg ccctgcagca accggagggg 120 

gagggagaag ggagcggaaa agtctccacc ggacgcggcc atggctcggg gggggggggg 180

cagcggagga gcgcttccgg ccgacgtctc gtcgctgatt ggcttctttt cctcccgccg 240 

tgtgtgaaaa cacaaatggc gtgttttggt tggcgtaagg cgcctgtcag ttaacggcag 300 

ccggagtgcg cagccgccgg cagcctcgct ctgcccactg ggtggggcgg gaggtaggtg 360 

gggtgaggcg agctggacgt gcgggcgcgg tcggcctctg gcggggcggg ggaggggagg 420 

gagggtcagc gaaagtagct cgcgcgcgag cggccgccca ccctcccctt cctctggggg 480

agtcgtttta cccgccgccg gccgggcctc gtcgtctgat tggctctcgg ggcccagaaa 540 

actggccctt gccattggct cgtgttcgtg caagttgagt ccatccgccg gccagcgggg 600 

gcggcgagga ggcgctccca ggttccggcc ctcccctcgg ccccgcgccg cagagtctgg 660 

ccgcgcgccc ctgcgcaacg tggcaggaag cgcgcgctgg gggcggggac gggcagtagg 720 

gctgagcggc tgcggggcgg gtgcaagcac gtttccgact tgagttgcct caagaggggc 780

gtgctgagcc agacctccat cgcgcactcc ggggagtgga gggaaggagc gagggctcag 840 

ttgggctgtt ttggaggcag gaagcacttg ctctcccaaa gtcgctctga gttgttatca 900 

gtaagggagc tgcagtggag taggcgggga gaaggccgca cccttctccg gaggggggag 960 

gggagtgttg caataccttt ctgggagttc tctgctgcct cctggcttct gaggaccgcc 1020 

ctgggcctgg gagaatccct tccccctctt ccctcgtgat ctgcaactcc agtctttcgc 1080

ctaggtaacc gatatccctg caggggtgac ctgcacgtct agggcgcagt agtccagggt 1140

ttccttgatg atgtcatact tatcctgtcc cttttttttc cacagctcgc ggttgaggac 1200 

aaactcttcg cggtctttcc agtactcctg caggtgactg actgagtcga cgacactgca 1260 

gagacctact tcactaacaa ccggtacagt tcgtggacca gatgggtgag gtggagtacg 1320 

cgcccgggga gcccaagggc acgccctggc acccgcaccg cggcttcgag accgtcacga 1380 

ataacttcgt atagcataca ttatacgaag ttataagctc gatgaattct accgggtagg 1440 

ggaggcgctt ttcccaaggc agtctggagc atgcgcttta gcagccccgc tggcacttgg 1500 

cgctacacaa gtggcctctg gcctcgcaca cattccacat ccaccggtag cgccaaccgg 1560 

ctccgttctt tggtggcccc ttcgcgccac cttctactcc tcccctagtc aggaagttcc 1620 

cccccgcccc gcagctcgcg tcgtgcagga cgtgacaaat ggaagtagca cgtctcacta 1680

gtctcgtgca gatggacagc accgctgagc aatggaagcg ggtaggcctt tggggcagcg 1740 

gccaatagca gctttgctcc ttcgctttct gggctcagag gctgggaagg ggtgggtccg 1800 

ggggcgggct caggggcggg ctcaggggcg gggcgggcgc gaaggtcctc ccgaggcccg 1860 

gcattctcgc acgcttcaaa agcgcacgtc tgccgcgctg ttctcctctt cctcatctcc 1920

gggcctttcg acgatccagc cgccaccatg aaaaagcctg aactcaccgc gacgtctgtc 1980

gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 2040 

gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 2100 

agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 2160 

ctcccgattc cggaagtgct tgacattggg gaattcagcg agagcctgac ctattgcatc 2220 

tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 2280

ctgcagccgg tcgcggaggc catggatgcg atcgctgcgg ccgatcttag ccagacgagc 2340 

gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 2400 

tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 2460

gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 2520 

cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 2580 

acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 2640 

atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 2700 

aggcatccgg agcttgcagg atcgccgcgg ctccgggcgt atatgctccg cattggtctt 2760

gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 2820 

cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 2880

agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 2940 

cgccccagca ctcgtccgag ggcaaaggaa tagtcgatgc agaaattgat gatctattaa 3000 



acaataaaga tgtccactaa aatggaagtt tttcctgtca tactttgtta agaagggtga 3060
gaacagagta cctacatttt gaatggaagg attggagcta cgggggtggg ggtggggtgg 3120
gattagataa atgcctgctc tttactgaag gctctttact attgctttat gataatgttt 3180
catagttgga tatcataatt taaacaagca aaaccaaatt aagggccagc tcattcctcc 3240
cactcatgat ctatagatct atagatctct cgtgggatca ttgtttttct cttgattccc 3300
actttgtggt tctaagtact gtggtttcca aatgtgtcag tttcatagcc tgaagaacga 3360
gatcagcagc ctctgttcca catacacttc attctcagta ttgttttgcc aagttctaat 3420
tccatcagaa gcttcagctg ctcgactaga ggatcataat cagccatacc acatttgtag 3480
aggttttact tgctttaaaa aacctcccac acctccccct gaacctgaaa cataaaatga 3540
atgcaattgt tgttgttaac ttgtttattg cagcttataa tggttacaaa taaagcaata 3600
gcatcacaaa tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca 3660
aactcatcaa tgtatcttat catgtctgga tccgtgtcat gtcggcgacc ctacgccccc 3720
aactgagaga actcaaaggt taccccagtt ggggcactac tcccgaaaac cgcttctgga 3780
tccataactt cgtatagcat acattatacg aagttatacc gggccaccat ggtcgcgagt 3840
agcttggcac tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa 3900
cttaatcgcc ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc 3960
accgatcgcc cttcccaaca gttgcgcagc ctgaatggcg aatggcgctt tgcctggttt 4020
ccggcaccag aagcggtgcc ggaaagctgg ctggagtgcg atcttcctga ggccgatact 4080
gtcgtcgtcc cctcaaactg gcagatgcac ggttacgatg cgcccatcta caccaacgta 4140
acctatccca ttacggtcaa tccgccgttt gttcccacgg agaatccgac gggttgttac 4200
tcgctcacat ttaatgttga tgaaagctgg ctacaggaag gccagacgcg aattattttt 4260
gatggcgtta actcggcgtt tcatctgtgg tgcaacgggc gctgggtcgg ttacggccag 4320
gacagtcgtt tgccgtctga atttgacctg agcgcatttt tacgcgccgg agaaaaccgc 4380
ctcgcggtga tggtgctgcg ttggagtgac ggcagttatc tggaagatca ggatatgtgg 4440
cggatgagcg gcattttccg tgacgtctcg ttgctgcata aaccgactac acaaatcagc 4500
gatttccatg ttgccactcg ctttaatgat gatttcagcc gcgctgtact ggaggctgaa 4560
gttcagatgt gcggcgagtt gcgtgactac ctacgggtaa cagtttcttt atggcagggt 4620
gaaacgcagg tcgccagcgg caccgcgcct ttcggcggtg aaattatcga tgagcgtggt 4680
ggttatgccg atcgcgtcac actacgtctg aacgtcgaaa acccgaaact gtggagcgcc 4740
gaaatcccga atctctatcg tgcggtggtt gaactgcaca ccgccgacgg cacgctgatt 4800
gaagcagaag cctgcgatgt cggtttccgc gaggtgcgga ttgaaaatgg tctgctgctg 4860
ctgaacggca agccgttgct gattcgaggc gttaaccgtc acgagcatca tcctctgcat 4920
ggtcaggtca tggatgagca gacgatggtg caggatatcc tgctgatgaa gcagaacaac 4980
tttaacgccg tgcgctgttc gcattatccg aaccatccgc tgtggtacac gctgtgcgac 5040
cgctacggcc tgtatgtggt ggatgaagcc aatattgaaa cccacggcat ggtgccaatg 5100
aatcgtctga ccgatgatcc gcgctggcta ccggcgatga gcgaacgcgt aacgcgaatg 5160
gtgcagcgcg atcgtaatca cccgagtgtg atcatctggt cgctggggaa tgaatcaggc 5220
cacggcgcta atcacgacgc gctgtatcgc tggatcaaat ctgtcgatcc ttcccgcccg 5280
gtgcagtatg aaggcggcgg agccgacacc acggccaccg atattatttg cccgatgtac 5340
gcgcgcgtgg atgaagacca gcccttcccg gctgtgccga aatggtccat caaaaaatgg 5400
ctttcgctac ctggagagac gcgcccgctg atcctttgcg aatacgccca cgcgatgggt 5460
aacagtcttg gcggtttcgc taaatactgg caggcgtttc gtcagtatcc ccgtttacag 5520
ggcggcttcg tctgggactg ggtggatcag tcgctgatta aatatgatga aaacggcaac 5580
ccgtggtcgg cttacggcgg tgattttggc gatacgccga acgatcgcca gttctgtatg 5640
aacggtctgg tctttgccga ccgcacgccg catccagcgc tgacggaagc aaaacaccag 5700
cagcagtttt tccagttccg tttatccggg caaaccatcg aagtgaccag cgaatacctg 5760
ttccgtcata gcgataacga gctcctgcac tggatggtgg cgctggatgg taagccgctg 5820
gcaagcggtg aagtgcctct ggatgtcgct ccacaaggta aacagttgat tgaactgcct 5880
gaactaccgc agccggagag cgccgggcaa ctctggctca cagtacgcgt agtgcaaccg 5940
aacgcgaccg catggtcaga agccgggcac atcagcgcct ggcagcagtg gcgtctggcg 6000
gaaaacctca gtgtgacgct ccccgccgcg tcccacgcca tcccgcatct gaccaccagc 6060
gaaatggatt tttgcatcga gctgggtaat aagcgttggc aatttaaccg ccagtcaggc 6120
tttctttcac agatgtggat tggcgataaa aaacaactgc tgacgccgct gcgcgatcag 6180
ttcacccgtg caccgctgga taacgacatt ggcgtaagtg aagcgacccg cattgaccct 6240
aacgcctggg tcgaacgctg gaaggcggcg ggccattacc aggccgaagc agcgttgttg 6300
cagtgcacgg cagatacact tgctgatgcg gtgctgatta cgaccgctca cgcgtggcag 6360
catcagggga aaaccttatt tatcagccgg aaaacctacc ggattgatgg tagtggtcaa 6420
atggcgatta ccgttgatgt tgaagtggcg agcgatacac cgcatccggc gcggattggc 6480
ctgaactgcc agctggcgca ggtagcagag cgggtaaact ggctcggatt agggccgcaa 6540
gaaaactatc ccgaccgcct tactgccgcc tgttttgacc gctgggatct gccattgtca 6600
gacatgtata ccccgtacgt cttcccgagc gaaaacggtc tgcgctgcgg gacgcgcgaa 6660
ttgaattatg gcccacacca gtggcgcggc gacttccagt tcaacatcag ccgctacagt 6720
caacagcaac tgatggaaac cagccatcgc catctgctgc acgcggaaga aggcacatgg 6780
ctgaatatcg acggtttcca tatggggatt ggtggcgacg actcctggag cccgtcagta 6840
tcggcggaat tccagctgag cgccggtcgc taccattacc agttggtctg gtgtcaaaaa 6900
taataataac cgggcagggg ggatctttgt gaaggaacct tacttctgtg gtgtgacata 6960
attggacaaa ctacctacag agatttaaag ctctaaggta aatataaaat ttttaagtgt 7020
ataatgtgtt aaactactga ttctaattgt ttgtgtattt tagattccaa cctatggaac 7080
tgatgaatgg gagcagtggt ggaatgccag atccagacat gataagatac attgatgagt 7140
ttggacaaac cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg 7200
ctattgcttt atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca 7260
ttcattttat gtttcaggtt cagggggagg tgtgggaggt tttttaaagc aagtaaaacc 7320
tctacaaatg tggtatggct gattatgatc tgcggccaaa tcggccggcc taggcgcgcc 7380
ggtaaccgaa gttcctatac tttctagaga ataggaactt cggaatagga acttcaagct 7440
taagcgctag cctagaagat gggcgggagt cttctgggca ggcttaaagg ctaacctggt 7500
gtgtgggcgt tgtcctgcag gggaattgaa caggtgtaaa attggaggga caagacttcc 7560
cacagatttt cggttttgtc gggaagtttt ttaatagggg caaataagga aaatgggagg 7620
ataggtagtc atctggggtt ttatgcagca aaactacagg ttattattgc ttgtgatccg 7680
cctcggagta ttttccatcg aggtagatta aagacatgct cacccgagtt ttatactctc 7740
ctgcttgaga tccttactac agtatgaaat tacagtgtcg cgagttagac tatgtaagca 7800
gaattttaat catttttaaa gagcccagta cttcatatcc atttctcccg ctccttctgc 7860
agccttatca aaaggtattt tagaacactc attttagccc cattttcatt tattatactg 7920
gcttatccaa cccctagaca gagcattggc attttccctt tcctgatctt agaagtctga 7980
tgactcatga aaccagacag attagttaca tacaccacaa atcgaggctg tagctggggc 8040
ctcaacactg cagttctttt ataactcctt agtacacttt ttgttgatcc tttgccttga 8100
tccttaattt tcagtgtcta tcacctctcc cgtcagtggt gttccacatt tgggcctatt 8160
ctcagtccag ggagttttac aacaatagat gtattgagaa tccaacctaa agcttaactt 8220
tccactccca tgaatgcctc tctccttttt ctccatttat aaactgagct attaaccatt 8280
aatggttcca ggtggatgtc tcctccccat attacctgat gtatcttaca tattgccagg 8340
ctgatatttt aagacattaa aaggtatatt tcattattga gccacatggt attgattact 8400
gcttactaaa attttgtcat tgtacacatc tgtaaaaggt ggttcctttt ggaatgcaaa 8460
gttcaggtgt ttgttgtctt tcctgaccta aggtcttgtg agcttgtatt ttttctattt 8520
aagcagtgct ttctcttgga ctggcttgac tcatggcatt ctacacgtta ttgctggtct 8580
aaatgtgatt ttgccaagct tcttcaggac ctataatttt gcttgacttg tagccaaaca 8640
caagtaaaat gattaagcaa caaatgtatt tgtgaagctt ggtttttagg ttgttgtgtt 8700
gtgtgtgctt gtgctctata ataatactat ccaggggctg gagaggtggc tcggagttca 8760
agagcacaga ctgctcttcc agaagtcctg agttcaattc ccagcaacca catggtggct 8820
cacaaccatc tgtaatggga tctgatgccc tcttctggtg tgtctgaaga ccacaagtgt 8880
attcacatta aataaataaa tcctccttct tcttcttttt ttttttttta aagagaatac 8940
tgtctccagt agaatttact gaagtaatga aatactttgt gtttgttcca atatggtagc 9000
caataatcaa attactcttt aagcactgga aatgttacca aggaactaat ttttatttga 9060
agtgtaactg tggacagagg agccataact gcagacttgt gggatacaga agaccaatgc 9120
agactttaat gtcttttctc ttacactaag caataaagaa ataaaaattg aacttctagt 9180
atcctatttg tttaaactgc tagctttact taacttttgt gcttcatcta tacaaagctg 9240
aaagctaagt ctgcagccat tactaaacat gaaagcaagt aatgataatt ttggatttca 9300
aaaatgtagg gccagagttt agccagccag tggtggtgct tgcctttatg cctttaatcc 9360
cagcactctg gaggcagaga caggcagatc tctgagtttg agcccagcct ggtctacaca 9420
tcaagttcta tctaggatag ccaggaatac acacagaaac cctgttgggg aggggggctc 9480
tgagatttca taaaattata attgaagcat tccctaatga gccactatgg atgtggctaa 9540
atccgtctac ctttctgatg agatttgggt attatttttt ctgtctctgc tgttggttgg 9600
gtcttttgac actgtgggct ttctttaaag cctccttcct gccatgtggt ctcttgtttg 9660
ctactaactt cccatggctt aaatggcatg gctttttgcc ttctaagggc agctgctgag 9720
atttgcagcc tgatttccag ggtggggttg ggaaatcttt caaacactaa aattgtcctt 9780
taattttttt tttaaaaaat gggttatata ataaacctca taaaatagtt atgaggagtg 9840
aggtggacta atattaaatg agtccctccc ctataaaaga gctattaagg ctttttgtct 9900
tatacttaac ttttttttta aatgtggtat ctttagaacc aagggtctta gagttttagt 9960
atacagaaac tgttgcatcg cttaatcaga ttttctagtt tcaaatccag agaatccaaa 10020
ttcttcacag ccaaagtcaa attaagaatt tctgactttt aatgttaatt tgcttactgt 10080
gaatataaaa atgatagctt ttcctgaggc agggtctcac tatgtatctc tgcctgatct 10140
gcaacaagat atgtagacta aagttctgcc tgcttttgtc tcctgaatac taaggttaaa 10200
atgtagtaat acttttggaa cttgcaggtc agattctttt ataggggaca cactaaggga 10260
gcttgggtga tagttggtaa aatgtgtttc aagtgatgaa aacttgaatt attatcaccg 10320
caacctactt tttaaaaaaa aaagccaggc ctgttagagc atgcttaagg gatccctagg 10380
acttgctgag cacacaagag tagttacttg gcaggctcct ggtgagagca tatttcaaaa 10440
aacaaggcag acaaccaaga aactacagtt aaggttacct gtctttaaac catctgcata 10500
tacacaggga tattaaaata ttccaaataa tatttcattc aagttttccc ccatcaaatt 10560
gggaca1:gga tttctccggt gaataggcag agttggaaac taaacaaatg ttggttttgt 10620
gatttgtgaa attgttttca agtgatagtt aaagcccatg agatacagaa caaagctgct 10680
atttcgaggt ctcttggttt atactcagaa gcacttcttt gggtttccct gcactatcct 10740
gatcatgtgc taggcctacc ttaggctgat tgttgttcaa ataaacttaa gtttcctgtc 10800
aggtgatgtc atatgatttc atatatcaag gcaaaacatg ttatatatgt taaacatttg 10860
tacttaatgt gaaagttagg tctttgtggg tttgattttt aattttcaaa acctgagcta 10920
aataagtcat ttttacatgt cttacatttg gtggaattgt ataattgtgg tttgcaggca 10980
agactctctg acctagtaac cctacctata gagcactttg ctgggtcaca agtctaggag 11040
tcaagcattt caccttgaag ttgagacgtt ttgttagtgt atactagttt atatgttgga 11100
ggacatgttt atccagaaga tattcaggac tatttttgac tgggctaagg aattgattct 11160
gattagcact gttagtgagc attgagtggc ctttaggctt gaattggagt cacttgtata 11220
tctcaaataa tgctggcctt ttttaaaaag cccttgttct ttatcaccct gttttctaca 11280
taatttttgt tcaaagaaat acttgtttgg atctcctttt gacaacaata gcatgttttc 11340
aagccatatt ttttttcctt tttttttttt tttttggttt ttcgagacag ggtttctctg 11400
tatagccctg gctgtcctgg aactcacttt gtagaccagg ctggcctcga actcagaaat 11460
ccgcctgcct ctgcctcctg agtgccggga ttaaaggcgt gcaccaccac gcctggctaa 11520
gttggatatt ttgttatata actataacca atactaactc cactgggtgg atttttaatt 11580
cagtcagtag tcttaagtgg tctttattgg cccttcatta aaatctactg ttcactctaa 11640
cagaggctgt tggtactagt ggcacttaag caacttccta cggatatact agcagattaa 11700
gggtcaggga tagaaactag tctagcgttt tgtataccta ccagctttat actaccttgt 11760
tctgatagaa atatttcagg acat 11784



<210> 107
<211> 1458
<212> DNA
<213> Bacteriophage TP901-1
<220>
<221> CDS
<222> (1)..(1455)
<400> 107

atg act aag aaa gta gca atc tat aca cga gta tcc act act aac caa 48
Met Thr Lys Lys Val Ala Ile Tyr Thr Arg Val Ser Thr Thr Asn Gln
1 5 10 15

gca gag gaa ggg ttc tca att gat gag caa att gac cgt tta aca aaa 96
Ala Glu Glu Gly Phe Ser Ile Asp Glu Gln Ile Asp Arg Leu Thr Lys
20 25 30

tat gct gaa gca atg ggg tgg caa gta tct gat act tat act gat gct 144
Tyr Ala Glu Ala Met Gly Trp Gln Val Ser Asp Thr Tyr Thr Asp Ala
35 40 45

ggt ttt tca ggg gcc aaa ctt gaa cgc cca gca atg caa aga tta atc 192
Gly Phe Ser Gly Ala Lys Leu Glu Arg Pro Ala Met Gln Arg Leu Ile
50 55 60

aac gat atc gag aat aaa gct ttt gat aca gtt ctt gta tat aag cta 240
Asn Asp Ile Glu Asn Lys Ala Phe Asp Thr Val Leu Val Tyr Lys Leu
65 70 75 80

gac cgc ctt tca cgt agt gta aga gat act ctt tat ctt gtt aag gat 288
Asp Arg Leu Ser Arg Ser Val Arg Asp Thr Leu Tyr Leu Val Lys Asp
85 90 95

gtg ttc aca aaa aat aaa ata gac ttt atc tcg ctt aat gaa agt att 336
Val Phe Thr Lys Asn Lys Ile Asp Phe Ile Ser Leu Asn Glu Ser Ile
100 105 110

gat act tct tct gct atg ggt agc ttg ttt ctc act att ctt tct gca 384
Asp Thr Ser Ser Ala Met Gly Ser Leu Phe Leu Thr Ile Leu Ser Ala
115 120 125

att aat gag ttt gaa aga gag aat ata aaa gaa cgc atg act atg ggt 432
Ile Asn Glu Phe Glu Arg Glu Asn Ile Lys Glu Arg Met Thr Met Gly
130 135 140

aaa cta ggg cga gcg aaa tct ggt aag tct atg atg tgg act aag aca 480
Lys Leu Gly Arg Ala Lys Ser Gly Lys Ser Met Met Trp Thr Lys Thr
145 150 155 160

gct ttt ggg tat tac cac aac aga aag aca ggt ata tta gaa att gtt 528
Ala Phe Gly Tyr Tyr His Asn Arg Lys Thr Gly Ile Leu Glu Ile Val
165 170 175



cct tta caa gct aca ata gtt gaa caa ata ttc act gat tat tta tca 576
Pro Leu Gln Ala Thr Ile Val Glu Gln Ile Phe Thr Asp Tyr Leu Ser
180 185 190

gga ata tca ctt aca aaa tta aga gat aaa ctc aat gaa tct gga cac 624
Gly Ile Ser Leu Thr Lys Leu Arg Asp Lys Leu Asn Glu Ser Gly His
195 200 205

atc ggt aaa gat ata ccg tgg tct tat cgt acc cta aga caa aca ctt 672
Ile Gly Lys Asp Ile Pro Trp Ser Tyr Arg Thr Leu Arg Gln Thr Leu
210 215 220

gat aat cca gtt tac tgt ggt tat atc aaa ttt aag gac agc cta ttt 720
Asp Asn Pro Val Tyr Cys Gly Tyr Ile Lys Phe Lys Asp Ser Leu Phe
225 230 235 240

gaa ggt atg cac aaa cca att atc cct tat gag act tat tta aaa gtt 768
Glu Gly Met His Lys Pro Ile Ile Pro Tyr Glu Thr Tyr Leu Lys Val
245 250 255

caa aaa gag cta gaa gaa aga caa cag cag act tat gaa aga aat aac 816
Gln Lys Glu Leu Glu Glu Arg Gln Gln Gln Thr Tyr Glu Arg Asn Asn
260 265 270

aac cct aga cct ttc caa gct aaa tat atg ctg tca ggg atg gca agg 864
Asn Pro Arg Pro Phe Gln Ala Lys Tyr Met Leu Ser Gly Met Ala Arg
275 280 285

tgc ggt tac tgt gga gca cct tta aaa att gtt ctt ggc cac aaa aga 912
Cys Gly Tyr Cys Gly Ala Pro Leu Lys Ile Val Leu Gly His Lys Arg
290 295 300

aaa gat gga agc cgc act atg aaa tat cac tgt gca aat aga ttt cct 960
Lys Asp Gly Ser Arg Thr Met Lys Tyr His Cys Ala Asn Arg Phe Pro
305 310 315 320

cga aaa aca aaa gga att aca gta tat aat gac aat aaa aag tgt gat 1008
Arg Lys Thr Lys Gly Ile Thr Val Tyr Asn Asp Asn Lys Lys Cys Asp
325 330 335

tca gga act tat gat tta agt aat tta gaa aat act gtt att gac aac 1056
Ser Gly Thr Tyr Asp Leu Ser Asn Leu Glu Asn Thr Val Ile Asp Asn
340 345 350

ctg att gga ttt caa gaa aat aat gac tcc tta ttg aaa att atc aat 1104
Leu Ile Gly Phe Gln Glu Asn Asn Asp Ser Leu Leu Lys Ile Ile Asn
355 360 365

ggc aac aac caa cct att ctt gat act tcg tca ttt aaa aag caa att 1152
Gly Asn Asn Gln Pro Ile Leu Asp Thr Ser Ser Phe Lys Lys Gln Ile
370 375 380

tca cag atc gat aaa aaa ata caa aag aac tct gat ttg tac cta aat 1200
Ser Gln Ile Asp Lys Lys Ile Gln Lys Asn Ser Asp Leu Tyr Leu Asn
385 390 395 400

gat ttt atc act atg gat gag ttg aaa gat cgt act gat tcc ctt cag 1248
Asp Phe Ile Thr Met Asp Glu Leu Lys Asp Arg Thr Asp Ser Leu Gln
405 410 415

gct gag aaa aag ctg ctt aaa gct aag att agc gaa aat aaa ttt aat 1296
Ala Glu Lys Lys Leu Leu Lys Ala Lys Ile Ser Glu Asn Lys Phe Asn
420 425 430

gac tct act gat gtt ttt gag tta gtt aaa act cag ttg ggc tca att 1344
Asp Ser Thr Asp Val Phe Glu Leu Val Lys Thr Gln Leu Gly Ser Ile
435 440 445

ccg att aat gaa cta tca tat gat aat aaa aag aaa atc gtc aac aac 1392
Pro Ile Asn Glu Leu Ser Tyr Asp Asn Lys Lys Lys Ile Val Asn Asn
450 455 460

ctt gta tca aag gtt gat gtt act gct gat aat gta gat atc ata ttt 1440
Leu Val Ser Lys Val Asp Val Thr Ala Asp Asn Val Asp Ile Ile Phe
465 470 475 480

aaa ttc caa ctc gct taa 1458
Lys Phe Gln Leu Ala
485



<210> 108
<211> 485
<212> PRT
<213> Bacteriophage TP901-1
<400> 108

Met Thr Lys Lys Val Ala Ile Tyr Thr Arg Val Ser Thr Thr Asn Gln
1 5 10 15

Ala Glu Glu Gly Phe Ser Ile Asp Glu Gln Ile Asp Arg Leu Thr Lys
20 25 30

Tyr Ala Glu Ala Met Gly Trp Gln Val Ser Asp Thr Tyr Thr Asp Ala
35 40 45

Gly Phe Ser Gly Ala Lys Leu Glu Arg Pro Ala Met Gln Arg Leu Ile
50 55 60

Asn Asp Ile Glu Asn Lys Ala Phe Asp Thr Val Leu Val Tyr Lys Leu
65 70 75 80

Asp Arg Leu Ser Arg Ser Val Arg Asp Thr Leu Tyr Leu Val Lys Asp
85 90 95

Val Phe Thr Lys Asn Lys Ile Asp Phe Ile Ser Leu Asn Glu Ser Ile
100 105 110

Asp Thr Ser Ser Ala Met Gly Ser Leu Phe Leu Thr Ile Leu Ser Ala
115 120 125

Ile Asn Glu Phe Glu Arg Glu Asn Ile Lys Glu Arg Met Thr Met Gly
130 135 140

Lys Leu Gly Arg Ala Lys Ser Gly Lys Ser Met Met Trp Thr Lys Thr
145 150 155 160

Ala Phe Gly Tyr Tyr His Asn Arg Lys Thr Gly Ile Leu Glu Ile Val
165 170 175

Pro Leu Gln Ala Thr Ile Val Glu Gln Ile Phe Thr Asp Tyr Leu Ser
180 185 190

Gly Ile Ser Leu Thr Lys Leu Arg Asp Lys Leu Asn Glu Ser Gly His
195 200 205

Ile Gly Lys Asp Ile Pro Trp Ser Tyr Arg Thr Leu Arg Gln Thr Leu
210 215 220

Asp Asn Pro Val Tyr Cys Gly Tyr Ile Lys Phe Lys Asp Ser Leu Phe
225 230 235 240

Glu Gly Met His Lys Pro Ile Ile Pro Tyr Glu Thr Tyr Leu Lys Val
245 250 255

Gln Lys Glu Leu Glu Glu Arg Gln Gln Gln Thr Tyr Glu Arg Asn Asn
260 265 270

Asn Pro Arg Pro Phe Gln Ala Lys Tyr Met Leu Ser Gly Met Ala Arg
275 280 285

Cys Gly Tyr Cys Gly Ala Pro Leu Lys Ile Val Leu Gly His Lys Arg
290 295 300

Lys Asp Gly Ser Arg Thr Met Lys Tyr His Cys Ala Asn Arg Phe Pro
305 310 315 320

Arg Lys Thr Lys Gly Ile Thr Val Tyr Asn Asp Asn Lys Lys Cys Asp
325 330 335

Ser Gly Thr Tyr Asp Leu Ser Asn Leu Glu Asn Thr Val Ile Asp Asn
340 345 350

Leu Ile Gly Phe Gln Glu Asn Asn Asp Ser Leu Leu Lys Ile Ile Asn
355 360 365

Gly Asn Asn Gln Pro Ile Leu Asp Thr Ser Ser Phe Lys Lys Gln Ile
370 375 380

Ser Gln Ile Asp Lys Lys Ile Gln Lys Asn Ser Asp Leu Tyr Leu Asn
385 390 395 400

Asp Phe Ile Thr Met Asp Glu Leu Lys Asp Arg Thr Asp Ser Leu Gln
405 410 415

Ala Glu Lys Lys Leu Leu Lys Ala Lys Ile Ser Glu Asn Lys Phe Asn
420 425 430

Asp Ser Thr Asp Val Phe Glu Leu Val Lys Thr Gln Leu Gly Ser Ile
435 440 445

Pro Ile Asn Glu Leu Ser Tyr Asp Asn Lys Lys Lys Ile Val Asn Asn
450 455 460

Leu Val Ser Lys Val Asp Val Thr Ala Asp Asn Val Asp Ile Ile Phe
465 470 475 480

Lys Phe Gln Leu Ala
485
Sample   RLU x 10^5 (Gal/Luci)   RLU (Luciferase)      RLU (βGal)
1        34 ± 18                 3631598 ± 903012      1324 ± 876
2        164 ± 54                2741969 ± 667568      4650 ± 2273
3        443 ± 151               3798872 ± 1288020     17529 ± 9304
4        163 ± 36                2471695 ± 611351      4060 ± 1376
5        500 ± 65                3570103 ± 750628      17801 ± 3892
6        755 ± 601               195822 ± 81858        754 ± 70
7        906 ± 316               119043 ± 67451        925 ± 273
8        879 ± 291               122557 ± 30054        1033 ± 270
9        694 ± 345               174380 ± 58876        1108 ± 367
10       874 ± 741               211182 ± 101011       1306 ± 383




1) pPGKnifD (reporter) only 


2) pCMV-XisA 25 ng 

3) pCMV-XisA 100 ng 
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6) pPGKattA (reporter) only 


7) pCMV-SSV 10 ng 

8) pCMV-SSV 20 ng 


9) pCMV-SSV(NNLS) 10 ng 

10) pCMV-SSV(NNLS) 20 ng 
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A.: Nontransfected 
control 



B.: pCMV-Cre 



C.: pCMV-C31Int(NLS) 



Fig. 5 
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